BabelFish: Building a Slang-Aware NLP Interface from Backend to Frontend

A technical write-up on building BabelFish, a platform for analyzing word embeddings with a focus on Twitch-style slang

Ferdinand Theil

May 28, 2025

This is a write-up of my project BabelFish, which I’ve been working on intermittently since March. It’s not finished yet, but I figured now is a good time to document my progress and the issues I’ve run into during development.

Currently, BabelFish is a website and API that lets you analyze words, sentences, and word embeddings, with a particular focus on Twitch-style slang. I’m overhauling the data collection pipelines and building new models for Twitch and Bluesky. A stretch goal is to incorporate time into the analysis, showing trends in sentiment over time across different platforms.

Getting Started #

I had previously trained some machine learning models based on a paper called FeelsGoodMan: Inferring Semantics of Twitch Neologisms¹. The goal was to build an interface that let my friends play with the models I’d built. I wanted to make this because I struggled to share and explain what the models did, since they only ran locally on my laptop.

As a backend programmer, this was a great opportunity to build a frontend and try my hand at a more full-stack project.

After some research and conversations with friends who knew frontend tech, I settled on the following stack:

Svelte
FastAPI

Because the models were built in Python using libraries like Gensim² and Scikit-learn³, I needed the API to be in Python. I chose not to build everything in Python, though—I wanted to learn new technologies, and I’d already made things in Flask before.

Also, since frontend work mostly happens in JavaScript, I figured anything I learned from one JS framework would likely transfer to others. Honestly, I don’t have a strong reason for picking Svelte over something else—people just seemed to like it.

Plumbing Up My Backend #

FastAPI turned out to be both a great and terrible choice for this project. Building the initial API was fast and easy. Most of my time went into designing the API endpoints rather than implementing them. I ended up with a design that was at least consistent, though getting response type-checking to work with Pydantic⁴ was a huge headache.

One of Python’s longstanding issues is distribution, so I try to build all my software as Python packages. This wasn’t straightforward with FastAPI.

To summarize: I found this discussion asking for the exact feature I needed. After digging through the code, I found the relevant section:

def get_default_path() -> Path:
    potential_paths = (
        "main.py",
        "app.py",
        "api.py",
        "app/main.py",
        "app/app.py",
        "app/api.py",
    )

    for full_path in potential_paths:
        path = Path(full_path)
        if path.is_file():
            return path

    raise FastAPICLIException(
        "Could not find a default file to run, please provide an explicit path"
    )

⁵

This means I can’t run my code based on the module name—it expects a path instead (even though Uvicorn supports modules just fine—FastAPI CLI just doesn’t implement it).

Remembering that Uvicorn had support for this, I checked whether I could call it directly. I was hopeful—there was a closed issue asking for exactly what I wanted. I tried it, but… nothing.

self.should_reload was always False, with no way to override it

if (reload_dirs or reload_includes or reload_excludes) and not self.should_reload:
    logger.warning(
        "Current configuration will not reload as not all conditions are met, " "please refer to documentation."
    )

⁶

For some reason, FastAPI tries to infer the module structure from a file path instead of just letting you provide a Python package name. (Uvicorn does support this—it can use the module name to run your code!)

Uvicorn allows you to run code via the module name or object, but it only automatically reloads if it knows the path to watch, which it can’t infer from a module name. (Even though you can explicitly provide reload_dirs…)
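For reference, here is a minimal sketch of what calling Uvicorn programmatically with explicit reload directories might look like. It is untested against this project and may well hit the same warning. The helper `reload_settings` and the module path `babelfish.api` are hypothetical names chosen for illustration; `uvicorn.run` and its `reload` and `reload_dirs` arguments are real. Uvicorn only enables reloading when the app is passed as an import string rather than an object, so the sketch builds one.

```python
import importlib.util
from pathlib import Path


def reload_settings(package: str, app_attr: str = "app") -> dict:
    """Build keyword arguments for uvicorn.run from an importable module name."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate {package!r}")
    return {
        # Import string, not an app object: required for reload to be enabled.
        "app": f"{package}:{app_attr}",
        "reload": True,
        # Watch the directory that holds the module's source file.
        "reload_dirs": [str(Path(spec.origin).parent)],
    }


def main() -> None:
    import uvicorn  # imported lazily so the helper above needs only the stdlib

    # "babelfish.api" is a hypothetical module that exposes `app`.
    uvicorn.run(**reload_settings("babelfish.api"))
```

The helper resolves the watch directory from the module name, which is the piece Uvicorn does not infer on its own.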

In the end, I gave up on automatic reloads.

I’m sure the rest of the project will go much better. Now it’s time to set up a Svelte project and start accessing my cool new API.

Hey, wait…

What the Hell is CORS? #

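CORS, or Cross-Origin Resource Sharing, is a mechanism that lets servers specify which origins (i.e., scheme, domain, and port combinations) are allowed to read their resources from a browser. By default, browsers enforce the same-origin policy, which stops JavaScript served from one origin from reading responses from another. This protects users from malicious scripts that try to pull their data from other sites, and CORS is how a server opts in to being read anyway. ⁷

A request from one origin to another is called a Cross-Origin Request, which happens to be exactly what I’m trying to do.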

[Figure: a screenshot of the Firefox console showing a Cross-Origin Request error]
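To make the mechanism concrete, here is a simplified sketch of the decision a server makes for each request. It is an illustration of the idea, not BabelFish's code: the function name `cors_headers` is made up, and the allowed origin assumes the frontend runs on Vite's default dev port, 5173.

```python
from typing import Optional

# Assumption: the frontend is served from Vite's default dev server address.
ALLOWED_ORIGINS = {"http://localhost:5173"}


def cors_headers(origin: Optional[str], allowed: set = ALLOWED_ORIGINS) -> dict:
    """Return the CORS response headers for a request's Origin header."""
    if origin is not None and origin in allowed:
        # Echo the origin back; the browser then lets the page read the response.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    # No header: the request may still reach the server, but the browser
    # withholds the response from the calling script.
    return {}
```

In a FastAPI app this logic would normally come from `CORSMiddleware` with an `allow_origins` list rather than being written by hand.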

After spending way too long messing around with Access-Control-Allow-Origin and Origin headers, I reached out to a friend to see how they deal with CORS. They pointed out that this is actually a really awful way of accessing the data and it’s far better to route the request through Vite using a proxy.

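import { sveltekit } from '@sveltejs/kit/vite';
import { defineConfig } from 'vite';

// Local FastAPI server that receives the proxied requests
const upstream = {
	target: 'http://localhost:8000/',
	secure: true, // certificate verification; only relevant for HTTPS targets
	changeOrigin: false, // keep the original Host header
	logLevel: 'info',
};

export default defineConfig({
	server: {
		allowedHosts: true,
		proxy: {
			// Forward every request whose path starts with /api
			'/api': upstream,
		},
	},
	plugins: [sveltekit()]
});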

It’s that simple. Now all requests to /api are forwarded to my upstream API, and I managed to sidestep the problem this time.
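As a rough model of what the proxy rule does, the sketch below maps a dev-server path to the upstream URL it would be forwarded to. It assumes no `rewrite` option is set, so the matched prefix is kept. The function name `resolve_upstream` and the `/api/sentiment` path are hypothetical.

```python
from typing import Optional

# Mirrors the proxy rule in the Vite config: path prefix -> upstream target.
PROXY_RULES = {"/api": "http://localhost:8000/"}


def resolve_upstream(path: str, rules: dict = PROXY_RULES) -> Optional[str]:
    """Return the upstream URL for a dev-server path, or None if unproxied."""
    for prefix, target in rules.items():
        if path.startswith(prefix):
            # The matched prefix is kept: /api/x is forwarded as /api/x.
            return target.rstrip("/") + path
    return None


if __name__ == "__main__":
    print(resolve_upstream("/api/sentiment?text=hi"))
    print(resolve_upstream("/about"))
```

Because the prefix is kept, the FastAPI routes need to live under /api themselves, or the proxy needs a `rewrite` function that strips it. Since the browser only ever talks to the dev server's own origin, no cross-origin request takes place.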

Hey Svelte is pretty fun #

Now that I’ve gone through the awful process of setting up these frameworks, programming in Svelte is actually pretty nice. I used Realtime Colors⁸ to pick out some themes and spent an afternoon going through different blogs and ~~stealing~~ borrowing some design ideas. This included using @media (prefers-color-scheme) to set up automatic dark/light theming, along with some scaling options using @media (min-width).

After spending some time learning basic Svelte via their tutorials, I was able to start winging it and setting up an interactive frontend. It’s surprisingly easy and intuitive!

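For something like a search box, you could use something like this. (The /api/sentiment endpoint and its response fields here are placeholders, not the real BabelFish API.)

<script>
    // Component state; the names match the markup below.
    let query = '';
    let compound_score = 0;
    let response_text = '';
    let timeout;

    // Hypothetical endpoint and response shape: adjust to the real API.
    async function fetchResults() {
        const res = await fetch(`/api/sentiment?text=${encodeURIComponent(query)}`);
        const data = await res.json();
        compound_score = data.compound;
        response_text = data.label;
    }

    // Re-runs whenever `query` changes; debounced so typing doesn't spam the API.
    $: if (query) {
        clearTimeout(timeout);
        timeout = setTimeout(fetchResults, 300);
    } else {
        // Empty input: reset what is displayed.
        compound_score = 0;
        response_text = '';
    }
</script>

<section class="sentiment-search">
    <h4>Sentiment analysis</h4>
    <!-- <textarea bind:value={query} placeholder="Enter text here..."></textarea> -->
    <input bind:value={query} placeholder="Enter text here..." />
    <button on:click={fetchResults}>Analyze Sentiment</button>

    <progress value={compound_score} max=2></progress>
    <p>Sentiment Score: {compound_score.toFixed(2)}</p>
    <div class="response-box">
        <p>{response_text}</p>
    </div>
</section>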

Svelte is fun.

All in all, this process came together pretty well over the course of a few days. I’ve added a bunch of fun new features, such as an interesting error page and some new visualisations. At the moment I’m still working on building a new dataset with modern data. Maybe I’ll come back and update this blog post!