DealForge autonomously sources, scores, and writes investment memos on venture deals. Stop manually hunting.

1,180+ deals tracked  ·  22 AI investment memos  ·  Updated daily

WhiskeySour

Show HN: WhiskeySour – A 10x faster drop-in replacement for BeautifulSoup

45 AI Score
Show_hn devtools Added Apr 25, 2026

Details

Sector
devtools
Total Funding
$0
Last Round
$0

About

The Problem

I’ve been using BeautifulSoup for some time. It’s the standard for ease of use in Python scraping, but it almost always becomes the performance bottleneck when processing large-scale datasets.

Parsing complex or massive HTML trees in Python typically suffers from high memory allocation costs and the overhead of the Python object model during tree traversal. In my production scraping workloads, the parser was consuming more CPU cycles than the network I/O. lxml is fast, but it also uses a lot of memory when processing large documents and can cause trouble with malformed HTML.

The Solution

I wanted to keep the API compatibility that makes BS4 great but eliminate the overhead that slows down high-volume pipelines. That’s why I built WhiskeySour. It also uses html5ever. And yes… I *vibe coded the whole thing*.

WhiskeySour is a drop-in replacement. You should be able to swap "from bs4 import BeautifulSoup" for "from whiskeysour import WhiskeySour" and see immediate speedups. Workflows that used to take more than 30 minutes might now take less than 5.

I have shared the detailed architecture of the library here: https://the-pro.github.io/whiskeySour/architecture/

Here is the benchmark report against bs4 with html.parser: https://the-pro.github.io/whiskeySour/bench-report/

Here is the link to the repo: https://github.com/the-pro/WhiskeySour

Why I’m sharing this

I’m looking for feedback from the community on two fronts:

1. Edge cases: If you have particularly messy or malformed HTML that BS4 handles well, I’d love to know if WhiskeySour encounters any regressions.
2. Benchmarks: If you are running high-volume parsers, I’d appreciate it if you could run a test on your own datasets and share the results.
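For anyone who wants to act on the benchmark request, here is a minimal timing sketch, not the author's benchmark suite. The import path "whiskeysour" and the class name WhiskeySour come from the post; that its constructor and find_all mirror BeautifulSoup is an assumption based only on the "drop-in" claim and is not verified here. The helper names (count_links_stdlib, load_candidates, benchmark) are hypothetical, and a standard-library parser is included as a stand-in so the harness runs even when neither package is installed.

```python
import time
from html.parser import HTMLParser
from typing import Callable, Dict, List


class _LinkCounter(HTMLParser):
    """Stdlib stand-in parser that counts <a> start tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += 1


def count_links_stdlib(html: str) -> int:
    """Count <a> tags using only the standard library."""
    parser = _LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.links


def load_candidates() -> Dict[str, Callable[[str], int]]:
    """Collect a link-counting callable for each parser that is importable."""
    candidates: Dict[str, Callable[[str], int]] = {"stdlib": count_links_stdlib}
    try:
        from bs4 import BeautifulSoup
        candidates["bs4"] = lambda html: len(
            BeautifulSoup(html, "html.parser").find_all("a")
        )
    except ImportError:
        pass
    try:
        # ASSUMPTION: import path is from the post; the constructor arguments
        # and find_all are assumed to match BeautifulSoup ("drop-in" claim).
        from whiskeysour import WhiskeySour
        candidates["whiskeysour"] = lambda html: len(
            WhiskeySour(html, "html.parser").find_all("a")
        )
    except ImportError:
        pass
    return candidates


def benchmark(
    parsers: Dict[str, Callable[[str], int]],
    documents: List[str],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the best-of-`repeats` wall-clock seconds per parser."""
    results: Dict[str, float] = {}
    for name, parse in parsers.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for doc in documents:
                parse(doc)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


if __name__ == "__main__":
    # Synthetic input; replace with your own HTML files for a real test.
    sample = "<html><body>" + '<a href="#">x</a>' * 1000 + "</body></html>"
    for name, seconds in benchmark(load_candidates(), [sample] * 20).items():
        print(f"{name}: {seconds:.4f}s")
```

Because each callable returns a link count, the same harness can double as a rough regression check for the edge-case request: run every available parser over a messy document and compare the counts before comparing the timings.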

AI Score Reasoning

WhiskeySour addresses a legitimate performance bottleneck in the massive web scraping market with a high-utility 'drop-in' replacement strategy. However, it currently lacks the traction, team depth, and clear monetization path required for a venture-scale investment, appearing more as a high-quality open-source tool than a business.

Source

Show_hn