
Today I feel the insatiable urge to build a tool to scrape the shit out of Facebook and export my timeline into an RSS feed, or at least something that can be easily integrated into the Fediverse.

I haven't posted anything on my Facebook profile for months nor accessed their website unless some friend shared a direct link to a photo or a video with me. Granted, I feel like the Fediverse is a much healthier place that doesn't make me feel as guilty as if I were chain-smoking and consuming junk food while driving a huge CO2-spewing SUV.

But, even if I have met a lot of amazing people here, and I have even managed to bridge Twitter profiles and content from a vast trove of RSS feeds, and I have built bots that take a lot of interesting content directly to my door, and I have even managed to keep messaging my friends on Messenger/WhatsApp through Matrix and Bitlbee bridges, there's still an uncomfortable truth that doesn't let me sleep at night: Zuckerberg is still holding most of my family and friends as hostages, they will probably never move to the Fediverse, and I'm missing out on the lives of my loved ones (as well as on a lot of interesting events happening around me) because that content is behind a huge impenetrable wall.

I'm sick of hearing "Facebook should be compelled to federate, or at least open up their APIs for personal usage, but we don't know where to start". Or "the DMA is great on paper, but it's hard to enforce". If regulators don't take the matter into their hands, then I will. And, if Facebook dares to sue me or lock my account, I'm ready to sue them back for violating the DMA. I'm ready to take this matter to court and spend my money on lawyers, because I want Facebook and their highly immoral "high switching costs" strategy to die amid the worst conceivable pains in this universe - or at least I want them to be forced to open up the data of my loved ones.

I was looking around for some up-to-date Facebook scrapers, but all I could find was this project github.com/kevinzg/facebook-sc (which only scrapes public pages) and some commercial solutions that provide Facebook scraping for profiling and ads purposes (which makes me wanna spit and puke on the people behind those businesses for proudly showing off the worst that a human being can be capable of and making a profit out of it).

I made a Facebook scraper around 10 years ago, but back then their pages were relatively simple, and a bit of beautifulsoup scripting was enough to scrape the shit out of them. I've now taken a look at the developer console while browsing the website, and I've been horrified by how much effort they've put into preventing exactly what I was trying to do - the whole Facebook feed is basically a bunch of <script type="application/json"> tags that download some custom minified JavaScript for each post, which in turn is used to decrypt some other JSON requests.
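To make the first step concrete, here is a minimal sketch of pulling the JSON payloads out of <script type="application/json"> tags using only the Python standard library. Everything specific in it is an assumption made for illustration: the sample HTML, the `post_id` and `text` keys, and the helper names are hypothetical and don't reflect Facebook's real markup. It also does nothing about the minified JavaScript or the decryption step described above, which is where the real work is.

```python
import json
from html.parser import HTMLParser


class JsonScriptExtractor(HTMLParser):
    """Collect the parsed payload of every <script type="application/json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._chunks = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        # Only start recording for script tags that declare a JSON payload.
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._chunks = []

    def handle_data(self, data):
        # The parser may deliver the script body in several chunks.
        if self._in_json_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.payloads.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                # Skip bodies that are not valid JSON instead of aborting.
                pass


def extract_json_payloads(html):
    """Return the decoded JSON payloads found in the given HTML string."""
    parser = JsonScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.payloads


# Hypothetical page: the structure and keys are made up for this example.
SAMPLE_HTML = """
<html><body>
<script type="application/json">{"post_id": "1", "text": "hello"}</script>
<script type="text/javascript">var x = 1;</script>
<script type="application/json">not valid json</script>
<script type="application/json">{"post_id": "2", "text": "world"}</script>
</body></html>
"""

payloads = extract_json_payloads(SAMPLE_HTML)
for payload in payloads:
    print(payload)
```

The same extraction could be done with beautifulsoup; the standard-library parser is used here only to keep the sketch free of dependencies.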

So I'm appealing to all the hackers and tinkerers out there: are there FLOSS projects that already do what I'm trying to do (basically allow you to sign in to Facebook with your account, get an access token, and scrape posts and comments from your own timeline)? If not, are there any volunteers out there who would like to join forces with me in a new dog-and-cat war with Facebook - starting with reverse engineering whatever mechanism they've put in place to obfuscate the HTML on their timelines?


p.s. there's something like this apparently: github.com/harismuneer/Ultimat.

The UFS apparently used to be open-source, and now the whole GitHub project consists of a README that invites people to pay $119 for a license and a Zoom call where the author shares the code and the instructions to install it.

I want to build something really FLOSS also to make sure that these borderline scammers who profit from a real-world need and exploit the open-source community have no bread left to eat.

@blacklight I was recently excluded from university communications at #RUC (in #Denmark) because I choose not to have a #Facebook acct. Another student said: “should we figure out a way to get the FB content to you?” Me: certainly not! I *want* people to be forced to leave the walled garden to reach me in the free world, otherwise I create a co-dependency that supports Facebook.

@blacklight A mechanism by which I could read FB content but not interact with it would strongly encourage me to register at Facebook. I would see content that compels a response & the temptation to respond could easily push registration. Your proposal would be a great recruiting tool for FB. This already happens where i see mirrored Twitter & Reddit posts that i would like to reply to.

@blacklight #Invidious is another example of this. I watch a video that compels me to comment, but I’m silenced unless I create a Google acct. It’s a gross oversight by Invidious creators to make the comms 1-way. If the Invidious platform incorporated a free-world way to reply, I would use it even if there would be no guarantee that the video producer sees it.

@blacklight An RSS feed of Facebook without a means to reply is a recipe for disaster. If there is going to be an RSS feed, then there needs to be a free-world reply mechanism linked in.

@expat I'm totally with you on this. Viewing should be the first natural step because it's (in theory) the easiest and the most urgent to implement. But the next natural step should be the ability for bidirectional interactions. From my side I will stop only when Facebook is really federated - either because forced by regulators, or because I've successfully reverse-engineered all the shit out of them.

Invidious/Piped are good examples - and something that touches me quite closely, as an admin of a Piped instance and builder of tools based on youtube-dl. I feel like they should really provide the ability for bidirectional interactions (even though they are severely limited by youtube-dl and the official YouTube APIs), but in the meantime they provide me with a way to watch videos from my favourite creators without Google's tracking machine. Since I consume YouTube content much more than I contribute to it, I feel like I've already taken a big step in the right direction.

@expat my proposal would actually be the worst nightmare for Facebook.

Let's be realistic: we can't expect billions of people to give up their main tool for communication that they've used for at least a decade. Even those that I have convinced to sign up to my Mastodon instance only post very sporadically here, because all of their networks are on the other side of the wall and they don't want to miss out. Even if we convince somebody to give up Facebook entirely, we will always be nothing but a small rounding error for the profit of that company.

What REALLY hurts them is somebody who scrapes their content - and that's why they lobbied and keep lobbying so hard against the DMA. I don't have a problem with the content, I have a problem with the container, and the best way to break the container is by freeing up its content.

Once you can subscribe to a Facebook timeline through an RSS feed, and you can respond to a Facebook post without leaving Mastodon, Facebook's added value goes to zero.

They make money by tracking everything they can on their web pages and apps: once we provide a solution that allows people to still communicate as if they were on Facebook, but without feeding their precious data to Facebook's tracking and profiling machine and without seeing its precious ads, a lot of people will be much more likely to make the jump - and Facebook's profit margins can really go down the drain.

@blacklight To be brief, I’ll just say that Mastodon would not be a suitable comment mechanism because it’s already very divided with all the non-transparent blocking that goes on. It would have to be a #fedi-connected centralized node so that all replies could be seen in one place. Or there would have to be a publish-subscribe mechanism on a designated Mastodon node to gather all comments.

@expat I envision something not very different from the bridging that Birdsite currently provides for Twitter, with the difference that the accounts you follow must be among your friends (or else you'll only be allowed to see their public posts), and comments on their activities need to be automatically bridged back to Facebook.

@blacklight I’m not familiar with Birdsite but anything that’s fed into walled gardens should hopefully not be the complete msg. Ideally it would just be a notice containing a link to a msg available for viewing outside of the walled garden.

@expat @blacklight
I also agree with this viewpoint regarding workarounds such as WhatsApp bridges, which allow us to better adapt to the status quo and thus make it more difficult to argue for alternatives. The more difficult it is for me to access these services in a safe manner, the better argument I have for convincing others to contact me through a different channel.

@blacklight I'm held hostage by Facebook via my dead friends - if I leave that place before the DMA/DSA come into effect, everything about them will be gone from my reach, presumably forever.
