I have the entire brain of a fruit fly on my laptop.
…
Well, to be more specific, I have a map of every connection between one neuron and another in the brain of a fruit fly, including the strength of those connections.
On my laptop.
For the past couple of weeks, I’ve been trying to figure out ways to make this raw data come to life in the form of something visual. Something I can see. Something I can show off to friends at parties, because obviously I’m the type of guy that gets invited to parties all the time.
The fruit fly brain has roughly 140,000 neural cells, and between them there are 34 million synapses (connections).
In storage terms, that might not seem like a lot. In computing terms, that’s a really big deal. Merely mapping every connection from one neuron to another makes for a lot of extra data, memory and compute cycles.
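To put rough numbers on that, here is a back-of-the-envelope sketch. The neuron count is the rounded figure from this post, the 4 million connected pairs is the approximate row count of the data file, and the 4-byte sizes are my own assumption about how you might store things. It compares a full "every neuron against every neuron" grid with a plain list of the connections that actually exist.

```julia
# Back-of-the-envelope sizes for two ways of storing the connectome.
# Counts are rounded figures from the post; byte sizes are assumptions.
n_neurons = 140_000
n_edges = 4_000_000          # connected neuron pairs (rows in the edge list)

# Dense adjacency matrix: one 4-byte weight for every possible (from, to) pair.
dense_bytes = n_neurons^2 * 4

# Edge list: two 4-byte neuron IDs plus one 4-byte weight per row.
edge_list_bytes = n_edges * (4 + 4 + 4)

println("dense matrix: ", dense_bytes / 1e9, " GB")
println("edge list:    ", edge_list_bytes / 1e6, " MB")
println("ratio:        ", round(Int, dense_bytes / edge_list_bytes), "x")
```

Under those assumptions the dense grid lands in the tens of gigabytes while the edge list stays in the tens of megabytes, which is why anything that accidentally treats the graph as a full grid gets painful fast.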
For reference, the human brain has somewhere around 86 to 100 billion neurons, depending on whose estimate you use, and somewhere in the range of 100 trillion to 1 quadrillion synapses. I’m not running that particular simulation on my laptop any time soon.
I’m not trying to see a 1:1 view of the general shape of the fruit fly brain; just the connections, neuron to neuron to neuron.
Why am I doing this? Uhh next question please.
You’d be surprised at how small the raw data for this is: 64 MB or thereabouts. It’s one big long CSV file with some 4 million rows and 3 columns: the “from” neuron, the “to” neuron, and the weight of the connection, i.e. how many synaptic connections there are between the two.
As a table, it looks kinda like this:
| fromNode | toNode | weight |
| --- | --- | --- |
| m125 | m81905 | 2 |
| m527 | m2692 | 35 |
| m876 | m151 | 8 |
... 3 million, 999 thousand, 997 more times
“fromNode” and “toNode” each represent a single neuron by an ID: “m” for male, and then a number.
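As a quick illustration of that format, here is a minimal sketch in plain Julia (no packages) that parses the three sample rows from the table above. The comma delimiter and the exact header line are assumptions on my part, and `parse_id` is just a hypothetical helper name.

```julia
# Three sample rows in the same shape as the table above (header plus data).
# The comma delimiter is an assumption about the real file.
sample = """
fromNode,toNode,weight
m125,m81905,2
m527,m2692,35
m876,m151,8
"""

# Turn an ID such as "m125" into the integer 125 by dropping the leading letter.
parse_id(id::AbstractString) = parse(Int, id[2:end])

# Parse each data line (skipping the header) into a named tuple.
edges = map(split(strip(sample), '\n')[2:end]) do line
    from, to, weight = split(line, ',')
    (from = parse_id(from), to = parse_id(to), weight = parse(Int, weight))
end

# Summing the weight column gives the total number of synapses.
total_synapses = sum(e.weight for e in edges)
println(length(edges), " connections, ", total_synapses, " synapses")
```

The same idea scales up: the number of rows is the number of connected neuron pairs, and the sum of the weight column is the synapse count.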
Okie dokie, shouldn’t be too hard to make it show up on a graph right? I mean they’re just dots and lines, surely?
Here is my first working attempt at doing exactly that:
That, my friends, is supposed to be a very zoomed-in portion of a gigantic 500 MB SVG file. An SVG file is a special kind of image where everything is described mathematically as shapes and coordinates, rather than as pixels. The layout is purely random, so there’s no ordering of the nodes, which are drawn as little cyan dots.
My attempt to display it in Safari - perhaps the most performant tool I had available for such a task - failed miserably. Converting a 500 MB SVG file into pixels uses more RAM than I’ve ever personally owned in my entire 37-year-long life.
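For a feel of what a random-layout SVG boils down to, here is a tiny sketch in plain Julia. It is not the code that produced the image; `random_layout_svg`, the canvas size, and the colours are all illustrative assumptions. Each node becomes one `<circle>` and each connection one `<line>`, so the file grows with every single edge.

```julia
using Random

# Build a small SVG string with randomly placed nodes and straight-line edges.
# `edges` is a collection of (from, to) index pairs; `n` is the node count.
function random_layout_svg(n::Int, edges; canvas = 500, seed = 1)
    rng = MersenneTwister(seed)
    xs = rand(rng, n) .* canvas           # random x position for each node
    ys = rand(rng, n) .* canvas           # random y position for each node
    io = IOBuffer()
    println(io, """<svg xmlns="http://www.w3.org/2000/svg" width="$canvas" height="$canvas">""")
    # One <line> element per connection.
    for (a, b) in edges
        println(io, """<line x1="$(xs[a])" y1="$(ys[a])" x2="$(xs[b])" y2="$(ys[b])" stroke="gray" stroke-width="0.5"/>""")
    end
    # One <circle> element per node, drawn on top of the lines.
    for i in 1:n
        println(io, """<circle cx="$(xs[i])" cy="$(ys[i])" r="2" fill="cyan"/>""")
    end
    println(io, "</svg>")
    return String(take!(io))
end

svg = random_layout_svg(4, [(1, 2), (2, 3), (3, 4)])
println(svg)
```

With millions of `<line>` elements at a hundred-odd characters each, a file in the hundreds of megabytes is about what you would expect.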
Still, it spent over an hour trying to render just one tiny square portion of the full image. It never even managed that.
I did render a PNG version of it - just pixels this time - but the visibility was not much better, thanks to the utterly random placement of all the neurons.
I hadn’t come this far just to give up now. In fact, once I realised I’d figured out enough of the Julia programming language to generate these plots, I tried something with a physics-based layout system instead of random - what I had originally been attempting to do - and although it took about 2 hours to compute just the arrangement of the neurons alone…
It did finally, eventually, come out with something.
Something fluffy and fuzzy, but unmistakably a thing!
This is the entire connectome of a fruit fly, computed, generated, and visualised, on a laptop.
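If you're wondering what a physics-based layout actually does, here is a toy sketch in the Fruchterman-Reingold style, written in plain Julia. It is an illustration of the general idea, not the library routine used for the picture; `toy_spring_layout` and all of its constants are my own assumptions. Every pair of nodes pushes apart, every edge pulls its two ends together, and the whole thing is nudged repeatedly until it settles.

```julia
using Random

# Toy force-directed layout: all pairs repel, connected pairs attract.
function toy_spring_layout(n::Int, edges; iterations = 200, seed = 1)
    rng = MersenneTwister(seed)
    pos = rand(rng, n, 2)                 # random starting positions in the unit square
    k = sqrt(1 / n)                       # rough ideal distance between neighbours
    temperature = 0.1                     # caps how far a node may move per step
    for _ in 1:iterations
        disp = zeros(n, 2)
        # Repulsion between all pairs: this O(n^2) loop dominates the cost.
        for i in 1:n, j in (i + 1):n
            delta = pos[i, :] .- pos[j, :]
            dist = max(sqrt(sum(abs2, delta)), 1e-9)
            repel = (k^2 / dist) .* (delta ./ dist)
            disp[i, :] .+= repel
            disp[j, :] .-= repel
        end
        # Attraction along each edge.
        for (a, b) in edges
            delta = pos[a, :] .- pos[b, :]
            dist = max(sqrt(sum(abs2, delta)), 1e-9)
            attract = (dist^2 / k) .* (delta ./ dist)
            disp[a, :] .-= attract
            disp[b, :] .+= attract
        end
        # Move each node along its net force, limited by the temperature.
        for i in 1:n
            len = max(sqrt(sum(abs2, disp[i, :])), 1e-9)
            pos[i, :] .+= (disp[i, :] ./ len) .* min(len, temperature)
        end
        temperature *= 0.98               # cool down so the layout settles
    end
    return pos
end

# Two triangles joined by a single bridge edge.
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
layout = toy_spring_layout(6, toy_edges)
println(layout)
```

In this naive version the all-pairs loop would mean roughly 10 billion pair checks per iteration for 140,000 neurons. I don't know what the real layout routine does internally, but that kind of cost would go some way towards explaining a two-hour wait.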
It’s merely the beginning of what I’ve been hoping to achieve. The next step is to find a layout algorithm that can arrange neurons into more logical groups by their connectivity, like neighbourhoods of function, followed by directional lines between nodes, and the weight shown as the thickness of the lines.
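One simple way to find those "neighbourhoods" is label propagation, sketched below in plain Julia. This is only one of many community-detection approaches and not something I can claim is the right one for a connectome; `propagate_labels` and the toy edge list are hypothetical. Each node keeps adopting whichever label carries the most connection weight among its neighbours until nothing changes.

```julia
# Minimal label propagation over weighted, undirected edges (a, b, weight).
function propagate_labels(n::Int, edges; max_rounds = 100)
    # Build an adjacency list of (neighbour, weight) pairs.
    neighbours = [Tuple{Int,Int}[] for _ in 1:n]
    for (a, b, w) in edges
        push!(neighbours[a], (b, w))
        push!(neighbours[b], (a, w))
    end
    labels = collect(1:n)                 # every node starts in its own group
    for _ in 1:max_rounds
        changed = false
        for v in 1:n
            isempty(neighbours[v]) && continue
            # Add up the edge weight behind each neighbouring label.
            votes = Dict{Int,Int}()
            for (u, w) in neighbours[v]
                votes[labels[u]] = get(votes, labels[u], 0) + w
            end
            # Take the heaviest label; ties go to the smallest label so the
            # result is deterministic.
            top = maximum(values(votes))
            best = minimum(l for (l, total) in votes if total == top)
            if best != labels[v]
                labels[v] = best
                changed = true
            end
        end
        changed || break                  # stop once a full pass changes nothing
    end
    return labels
end

# Two tightly connected triangles joined by one weak link.
toy_edges = [(1, 2, 5), (2, 3, 5), (1, 3, 5), (4, 5, 5), (5, 6, 5), (4, 6, 5), (3, 4, 1)]
groups = propagate_labels(6, toy_edges)
println(groups)
```

Nodes that share a label could then be drawn in the same colour or pulled towards the same region of the layout.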
Eventually, though, what I’d really like to do is simulate action potentials and watch them flow, beginning at one neuron and spreading through the network, then render the whole thing as an animation.
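As a very rough idea of what that could look like, here is a toy cascade in plain Julia. It is nowhere near a real neuron model: there is no inhibition, no timing, and the threshold is a number I made up. `propagate_spikes` is a hypothetical name, and the weights are borrowed from the sample rows earlier in the post. A neuron fires once the summed weight arriving from neurons that fired in the previous step reaches the threshold, and each step becomes one frame of a would-be animation.

```julia
# Toy spike propagation over directed, weighted edges (from, to, weight).
# Each neuron fires at most once, which keeps the cascade finite.
function propagate_spikes(n::Int, edges, start::Int; threshold = 5, max_steps = 50)
    # Directed adjacency list: outgoing[a] holds (target, weight) pairs.
    outgoing = [Tuple{Int,Int}[] for _ in 1:n]
    for (a, b, w) in edges
        push!(outgoing[a], (b, w))
    end
    has_fired = falses(n)
    has_fired[start] = true
    frames = [[start]]                    # frames[t] lists the neurons firing at step t
    for _ in 1:max_steps
        # Sum the input each neuron receives from the previous frame.
        input = zeros(Int, n)
        for a in frames[end], (b, w) in outgoing[a]
            input[b] += w
        end
        firing = [v for v in 1:n if !has_fired[v] && input[v] >= threshold]
        isempty(firing) && break          # the cascade has died out
        has_fired[firing] .= true
        push!(frames, firing)
    end
    return frames
end

# A chain of four neurons; the last link is too weak to pass the signal on.
frames = propagate_spikes(4, [(1, 2, 35), (2, 3, 8), (3, 4, 2)], 1)
println(frames)
```

Each entry in `frames` could then be highlighted on top of the layout, one image per step, to get the flowing effect.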
None of this is new, mind you. There are animations out there which show such simulations. I just wanted to do it by mah damn self.
I also found this little neuron sittin waaaay out on his lonesome.
The code is pretty simple. I’ve never done any kind of data science before, although I am a professional software engineer. If I can do it, I reckon just about anyone can, too. I mean, why not?
If this project gets anywhere, I’ll make a little GitHub repo. For now, you’ll have to be satisfied with this:
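using CSV
using DataFrames
using Graphs, SimpleWeightedGraphs
# LightGraphs is the older, archived predecessor of Graphs and exports the
# same names, so loading both causes conflicts; Graphs alone is enough.
using GraphPlot, Plots

# Read the edge list: one row per connected pair of neurons
male_edges_df = CSV.read("male_connectome_graph.csv", DataFrame)

# Strip non-numeric characters from an ID such as "m125" and convert to integer.
# AbstractString covers whichever string type CSV.jl picks for the column.
function strip_and_parse(s::AbstractString)
    numeric_str = replace(s, r"\D" => "")
    return parse(Int, numeric_str)
end

cols = eachcol(male_edges_df)
fromNodes = map(strip_and_parse, cols[1])
toNodes = map(strip_and_parse, cols[2])
weights = cols[3]

# Build an undirected weighted graph; vertex numbers are the numeric neuron IDs
s = SimpleWeightedGraph(fromNodes, toNodes, weights)

# Named `p` rather than `plot` so it does not shadow `plot` from Plots
p = gplot(s)
display(p)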
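One thing worth checking with code like this: as far as I can tell, `SimpleWeightedGraph(fromNodes, toNodes, weights)` treats the numbers themselves as vertex positions, so if the neuron IDs have gaps, every unused number would turn into an unconnected dot. I don't know whether this data set has such gaps, so take this as a hedged sketch rather than a fix. `dense_ids` is a hypothetical helper that renumbers whatever IDs appear onto 1, 2, 3, and so on.

```julia
# Map arbitrary neuron IDs onto the dense range 1:n.
# IDs are numbered in order of first appearance, scanning the "from" column
# first and then the "to" column.
function dense_ids(from_ids, to_ids)
    index = Dict{eltype(from_ids),Int}()
    for id in Iterators.flatten((from_ids, to_ids))
        # Only assigns a new number the first time an ID is seen.
        get!(index, id, length(index) + 1)
    end
    return [index[id] for id in from_ids], [index[id] for id in to_ids], index
end

# The numeric IDs from the sample rows shown earlier in the post.
from_dense, to_dense, index = dense_ids([125, 527, 876], [81905, 2692, 151])
println(from_dense, " ", to_dense)
```

The returned `index` dictionary lets you translate back from a dense vertex number to the original neuron ID when you want to look a neuron up.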
The data is the connectome data set from the FlyWire project.
They also have web-based tools that you can use to explore the connectome yourself, without having to get your hands dirty like me. Highly recommended.
This is not my usual kind of post, but if you want more like this, please do let me know and I will give the people what they want!
Okay, that is cool!!
Fun stuff. I have no clue if there are attributes to each point in the data set you’re working with, but if not, and if colorizing it is a next step you’re considering, then maybe do it by calculating the centroid and using the vector (i, j, k) from that to select a 16-bit RGB color?