<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: AlwaysPrimeDev</title>
    <description>The latest articles on Forem by AlwaysPrimeDev (@alwaysprimedev).</description>
    <link>https://forem.com/alwaysprimedev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F625880%2F19bc46ba-3aaf-4534-b39e-6e96164e8a8e.png</url>
      <title>Forem: AlwaysPrimeDev</title>
      <link>https://forem.com/alwaysprimedev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/alwaysprimedev"/>
    <language>en</language>
    <item>
      <title>I Built a LinkedIn Profile Scraper on Apify for Public Profiles, Company Enrichment, and Lead Research</title>
      <dc:creator>AlwaysPrimeDev</dc:creator>
      <pubDate>Sun, 29 Mar 2026 11:54:32 +0000</pubDate>
      <link>https://forem.com/alwaysprimedev/i-built-a-linkedin-profile-scraper-on-apify-for-public-profiles-company-enrichment-and-lead-1la5</link>
      <guid>https://forem.com/alwaysprimedev/i-built-a-linkedin-profile-scraper-on-apify-for-public-profiles-company-enrichment-and-lead-1la5</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgpwyol6cgsnqaphgf8o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgpwyol6cgsnqaphgf8o.png" alt=" " width="800" height="475"&gt;&lt;/a&gt;&lt;br&gt;
Public LinkedIn data is still one of the most useful inputs for lead generation, recruiting, founder sourcing, and&lt;br&gt;
market research.&lt;/p&gt;

&lt;p&gt;The problem is that many LinkedIn scraping workflows have too much friction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;they depend on cookies&lt;/li&gt;
&lt;li&gt;they break the moment setup is slightly wrong&lt;/li&gt;
&lt;li&gt;they return shallow profile data&lt;/li&gt;
&lt;li&gt;they make you wait until the whole run finishes before you can use anything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted something simpler.&lt;/p&gt;

&lt;p&gt;So I built a LinkedIn Profile Scraper on Apify that works with public profile URLs, does not require LinkedIn cookies,&lt;br&gt;
and returns structured profile data plus company enrichment and best-effort contact discovery from public company&lt;br&gt;
websites.&lt;/p&gt;
&lt;h2&gt;
  
  
  What the actor does
&lt;/h2&gt;

&lt;p&gt;You pass in one or more public LinkedIn profile URLs like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"profileUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.linkedin.com/in/williamhgates"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.linkedin.com/in/satyanadella"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each public profile, the actor can return:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;full name, headline, summary, location&lt;/li&gt;
&lt;li&gt;followers and connections&lt;/li&gt;
&lt;li&gt;current role and company&lt;/li&gt;
&lt;li&gt;work experience and education&lt;/li&gt;
&lt;li&gt;recent posts and articles&lt;/li&gt;
&lt;li&gt;company LinkedIn URL, website, industry, and size&lt;/li&gt;
&lt;li&gt;best-effort email candidates discovered from public company pages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhad3jnype2b9r1zqc1vd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhad3jnype2b9r1zqc1vd.png" alt=" " width="800" height="470"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That makes it useful not only for scraping profiles, but for building enriched lead or research datasets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I built it this way
&lt;/h2&gt;

&lt;p&gt;The main design goal was low-friction enrichment.&lt;/p&gt;

&lt;p&gt;Instead of asking users to manage session cookies, I focused on publicly accessible profile pages. Then I extended the&lt;br&gt;
output beyond the profile itself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;company details are enriched from public LinkedIn company pages&lt;/li&gt;
&lt;li&gt;email candidates are discovered from public company website pages like /contact, /about, and /team&lt;/li&gt;
&lt;li&gt;successful profiles are streamed into the Apify dataset as soon as they finish&lt;/li&gt;
&lt;li&gt;failed items are kept out of the main result dataset so the output stays clean&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters more than people think. If you are enriching hundreds of profiles, you usually do not want to&lt;br&gt;
wait for the entire batch before the first usable results appear.&lt;/p&gt;
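&lt;p&gt;The email discovery step can be sketched as a simple pattern scan over fetched public pages. This is an illustrative version only, assuming a regex-based extractor filtered to the company's own domain; the real actor may use different heuristics:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emailPattern is deliberately simple; real-world email matching
// needs more care, but this is enough for candidate discovery.
var emailPattern = regexp.MustCompile(`[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}`)

// extractEmailCandidates pulls email-looking strings out of a public
// page's HTML and keeps only those on the company's own domain.
func extractEmailCandidates(html, companyDomain string) []string {
	seen := map[string]bool{}
	var out []string
	for _, m := range emailPattern.FindAllString(html, -1) {
		m = strings.ToLower(m)
		if !strings.HasSuffix(m, "@"+companyDomain) || seen[m] {
			continue
		}
		seen[m] = true
		out = append(out, m)
	}
	return out
}

func main() {
	page := `<p>Reach us at Sales@example.com or support@example.com.
	         Our partner: hello@other.org</p>`
	fmt.Println(extractEmailCandidates(page, "example.com"))
	// prints [sales@example.com support@example.com]
}
```

&lt;p&gt;Filtering to the company domain is what keeps these "best-effort candidates" rather than noise scraped from unrelated links.&lt;/p&gt;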

&lt;h2&gt;
  
  
  Technical notes
&lt;/h2&gt;

&lt;p&gt;The actor is written in Go and uses concurrent workers, retry handling, request timeouts, and proxy support. On the&lt;br&gt;
parsing side, it combines HTML selectors with JSON-LD extraction to get more reliable structured data from public&lt;br&gt;
pages.&lt;/p&gt;

&lt;p&gt;On the Apify side, I wanted the actor to feel like a production tool, not just a script:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;minimal input&lt;/li&gt;
&lt;li&gt;progressive dataset output&lt;/li&gt;
&lt;li&gt;export to JSON, CSV, or Excel&lt;/li&gt;
&lt;li&gt;easy connection to webhooks, Make, Zapier, n8n, Airtable, Google Sheets, or a CRM&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Good use cases
&lt;/h2&gt;

&lt;p&gt;This actor is a good fit if you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;recruiter snapshots of public profiles&lt;/li&gt;
&lt;li&gt;enriched prospect data for outbound&lt;/li&gt;
&lt;li&gt;founder and operator sourcing&lt;/li&gt;
&lt;li&gt;company and talent mapping&lt;/li&gt;
&lt;li&gt;quick public-profile research pipelines inside Apify&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Compliance note
&lt;/h2&gt;

&lt;p&gt;This actor is intended for publicly visible LinkedIn data only. It is not meant to bypass authentication walls or&lt;br&gt;
access private profile data. As always, make sure your usage complies with applicable rules, laws, and internal&lt;br&gt;
policies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;I did not want to build “just another scraper.” I wanted an Apify actor that turns a public LinkedIn profile URL into&lt;br&gt;
usable structured research data with as little setup as possible.&lt;/p&gt;

&lt;p&gt;If that matches your workflow, you can try the actor here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/alwaysprimedev/linkedin-profile-scraper" rel="noopener noreferrer"&gt;LinkedIn Profile Scraper on Apify&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If there is interest, I can share more implementation details in a follow-up post about the parsing and enrichment&lt;br&gt;
pipeline behind it.&lt;/p&gt;

</description>
      <category>apify</category>
      <category>webscraping</category>
      <category>go</category>
      <category>automation</category>
    </item>
    <item>
      <title>How I Built an Instagram Profile Scraper in Go and Shipped It to Apify</title>
      <dc:creator>AlwaysPrimeDev</dc:creator>
      <pubDate>Wed, 18 Mar 2026 21:19:05 +0000</pubDate>
      <link>https://forem.com/alwaysprimedev/how-i-built-an-instagram-profile-scraper-in-go-and-shipped-it-to-apify-35d1</link>
      <guid>https://forem.com/alwaysprimedev/how-i-built-an-instagram-profile-scraper-in-go-and-shipped-it-to-apify-35d1</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq87qyesgpqu40xjj615.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq87qyesgpqu40xjj615.png" alt=" " width="800" height="491"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I recently built a small Instagram profile scraper in Go, packaged it as an Apify Actor, and published it so other people can use it without maintaining the infrastructure themselves.&lt;/p&gt;

&lt;p&gt;The goal was simple: fetch public Instagram profile data by username and return clean, automation-friendly JSON. I did&lt;br&gt;
not want browser automation, heavy dependencies, or deeply nested output that becomes painful to use in datasets, exports, or pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;A lot of scraping projects work, but they are hard to operationalize.&lt;/p&gt;

&lt;p&gt;They rely on full browser stacks, break on minor changes, or return raw payloads that still need another transformation layer before they become useful. For profile lookups, I wanted something much lighter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;input: one or more Instagram usernames&lt;/li&gt;
&lt;li&gt;output: structured profile data&lt;/li&gt;
&lt;li&gt;deployment: packaged for Apify&lt;/li&gt;
&lt;li&gt;operations: proxy-ready and resilient to partial failures&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The approach
&lt;/h2&gt;

&lt;p&gt;I built the Actor in pure Go with no external dependencies beyond the standard library.&lt;/p&gt;

&lt;p&gt;Instead of browser automation, the scraper makes a direct request to Instagram’s web profile endpoint and sends the headers that Instagram expects for that request. That keeps the runtime small and fast, which is a good fit for an Apify Actor.&lt;/p&gt;

&lt;p&gt;The Actor accepts either a legacy &lt;code&gt;username&lt;/code&gt; field or a &lt;code&gt;usernames&lt;/code&gt; array, normalizes the input, strips &lt;code&gt;@&lt;/code&gt;, and removes duplicates. That makes it easier to use both manually and from automations.&lt;/p&gt;
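&lt;p&gt;That normalization step can be sketched like this. Lowercasing is my assumption here, added because Instagram usernames are case-insensitive; the rest mirrors the behavior described above:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeUsernames trims whitespace, strips a leading "@",
// lowercases (an assumption), and drops duplicates in order.
func normalizeUsernames(raw []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, u := range raw {
		u = strings.ToLower(strings.TrimPrefix(strings.TrimSpace(u), "@"))
		if u == "" || seen[u] {
			continue
		}
		seen[u] = true
		out = append(out, u)
	}
	return out
}

func main() {
	in := []string{"@NASA", "nasa", " natgeo ", ""}
	fmt.Println(normalizeUsernames(in)) // prints [nasa natgeo]
}
```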

&lt;h2&gt;
  
  
  What the scraper returns
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8x8jjvisxe13j5bzwmo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8x8jjvisxe13j5bzwmo.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Actor extracts and normalizes the most useful profile fields, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;username and internal Instagram ID&lt;/li&gt;
&lt;li&gt;full name and biography&lt;/li&gt;
&lt;li&gt;follower, following, and post counts&lt;/li&gt;
&lt;li&gt;profile picture URLs&lt;/li&gt;
&lt;li&gt;private, verified, business, and professional flags&lt;/li&gt;
&lt;li&gt;related profiles&lt;/li&gt;
&lt;li&gt;latest posts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;latestPosts&lt;/code&gt; section is where I spent more time than expected. I did not want to return only a shortcode and a caption. I wanted each post to be immediately useful, so I included things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;caption text&lt;/li&gt;
&lt;li&gt;hashtags and mentions parsed from the caption&lt;/li&gt;
&lt;li&gt;likes and comments count&lt;/li&gt;
&lt;li&gt;dimensions&lt;/li&gt;
&lt;li&gt;image URLs&lt;/li&gt;
&lt;li&gt;tagged users&lt;/li&gt;
&lt;li&gt;child posts for carousel content&lt;/li&gt;
&lt;li&gt;normalized timestamps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That way, the Actor output is already useful for lead generation, competitor monitoring, influencer research, and internal dashboards.&lt;/p&gt;
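&lt;p&gt;The hashtag and mention parsing can be sketched with two small regular expressions. Instagram's actual rules for valid tag characters are stricter, so treat these patterns as approximations:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"regexp"
)

// Simple caption patterns: hashtags allow Unicode letters and
// digits, mentions allow the ASCII username character set.
var (
	hashtagRe = regexp.MustCompile(`#([\p{L}\p{N}_]+)`)
	mentionRe = regexp.MustCompile(`@([a-zA-Z0-9._]+)`)
)

// parseCaption extracts hashtags and mentions from a post caption.
func parseCaption(caption string) (hashtags, mentions []string) {
	for _, m := range hashtagRe.FindAllStringSubmatch(caption, -1) {
		hashtags = append(hashtags, m[1])
	}
	for _, m := range mentionRe.FindAllStringSubmatch(caption, -1) {
		mentions = append(mentions, m[1])
	}
	return hashtags, mentions
}

func main() {
	h, m := parseCaption("Launch day with @acme_labs! #golang #scraping")
	fmt.Println(h, m) // prints [golang scraping] [acme_labs]
}
```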

&lt;h2&gt;
  
  
  Making it practical for Apify
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fps0ssmkmnmhqjyoi94zy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fps0ssmkmnmhqjyoi94zy.png" alt=" " width="800" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Building the scraper itself was only half the task. The other half was productizing it.&lt;/p&gt;

&lt;p&gt;I added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an Apify input schema for usernames&lt;/li&gt;
&lt;li&gt;a dataset schema for cleaner output browsing&lt;/li&gt;
&lt;li&gt;a Docker build so the Actor can run consistently&lt;/li&gt;
&lt;li&gt;dataset push logic so each profile is saved directly to the Apify dataset&lt;/li&gt;
&lt;li&gt;proxy support for more reliable requests at scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One implementation detail I care about is failure handling. If one username is invalid or unavailable, the whole run should not fail. The Actor skips missing profiles and continues processing the rest. It only fails the run on actual technical errors such as network or dataset write failures.&lt;/p&gt;

&lt;p&gt;That matters in production much more than it seems during local development.&lt;/p&gt;
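&lt;p&gt;The skip-versus-fail policy can be sketched with a sentinel error. The names here are hypothetical, but the shape matches the behavior described above: per-profile problems are skipped, technical errors abort the run:&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// errProfileUnavailable marks a per-username problem (missing,
// private, or renamed profile) that should be skipped, not fatal.
var errProfileUnavailable = errors.New("profile unavailable")

// processAll skips unavailable profiles but aborts on technical
// errors such as network or dataset write failures.
func processAll(usernames []string, fetch func(string) (string, error)) ([]string, error) {
	var results []string
	for _, u := range usernames {
		out, err := fetch(u)
		if errors.Is(err, errProfileUnavailable) {
			fmt.Printf("skipping %s: %v\n", u, err)
			continue
		}
		if err != nil {
			return results, fmt.Errorf("technical failure on %s: %w", u, err)
		}
		results = append(results, out)
	}
	return results, nil
}

func main() {
	fetch := func(u string) (string, error) {
		if u == "ghost" {
			return "", errProfileUnavailable
		}
		return "data:" + u, nil
	}
	res, err := processAll([]string{"nasa", "ghost", "natgeo"}, fetch)
	fmt.Println(res, err) // prints [data:nasa data:natgeo] <nil>
}
```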

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;A few lessons stood out while building this:&lt;/p&gt;

&lt;p&gt;First, scraping is only part of the value. Data shape matters just as much. A flat, predictable output is more valuable than a huge raw JSON blob.&lt;/p&gt;

&lt;p&gt;Second, operational details matter early. Timeouts, proxy support, and partial-failure handling are not “later” concerns if you want to publish a usable product.&lt;/p&gt;

&lt;p&gt;Third, packaging changes how you think. Once I decided to publish the scraper on Apify, I had to think less like a developer running a script and more like someone maintaining a small API product.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final result
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdvqhl2rilwbcimcne8y9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdvqhl2rilwbcimcne8y9.png" alt=" " width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The result is a lightweight Instagram Profile Scraper Actor in Go that can fetch one or many public profiles and return structured output ready for datasets and automations.&lt;/p&gt;

&lt;p&gt;If you want to try it without building your own pipeline, you can check it out here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/alwaysprimedev/instagram-profile-scraper" rel="noopener noreferrer"&gt;Instagram Profile Scraper on Apify&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are building scraping tools yourself, my main advice is this: optimize for usable output, not just successful requests. That is usually what makes the difference between a side script and a product.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>go</category>
      <category>showdev</category>
      <category>webscraping</category>
    </item>
  </channel>
</rss>
