To Read Articles
Just for fun! A friend and I started developing a non-professional web application called Structies. Its purpose is to introduce people to the most popular data structures by letting them visualize the main operations and interact with them in a friendly way.
Are Dockerfiles good enough?
For those looking for a fast overview of containers click here. Containers have quickly become the favorite way to deploy software, for a lot of good reasons. They have allowed, for the first time, developers to test "as close to production" as possible.
“Find the Difference” in Python
Recently, I have been focusing on reviewing and trying out Python built-in libraries. This self-assigned task has brought me a lot of fun. There are many interesting features in Python that provide out-of-the-box implementations and solutions to different kinds of problems.
I found the best anagram in English
This was easy to do, even at the time, when the word list itself, at 2.5 megabytes, was a file of significant size. Perl and its cousins were not yet common; in those days I used Awk. But the task is not very different in any reasonable language:
A dive into spatial search algorithms
I’m obsessed with software performance. One of my main responsibilities at Mapbox is discovering ways to make our mapping platform faster. And when it comes to processing and displaying spatial data at scale, there’s no concept more useful and important than a spatial index.
The Olympics: How to Build a Linked Data Applica...
In 2012, the BBC famously used linked data to support coverage of the London Olympics on its website, app, and interactive video player. They have continued to champion the benefits of semantic technologies to this day.
Who says using RDF is hard?
The Linked Data ecosystem and RDF as its graph data model have been around for many years already. Even though there is an increasing interest in knowledge graphs, many developers are scared off by RDF due to its reputation of being complicated.
10 rules for better dashboard design
Dashboard design is a frequent request these days. Businesses dream about a simple view that presents all information, shows trends and risky areas, updates users on what happened — a view that will guide them into a bright financial future.
Predicting Occupancy on the Belgian railroads, b...
Pieter Colpaert, the #opendata specialist in Belgium, put out a call to the community to try to predict which trains will have high occupancy. I am not a Data Scientist, but decided to give it a try on Azure Machine Learning, my employer’s Machine Learning offering in the Cloud.
Obsessive Curiosity & Uncompromising Idealism
Moments die quickly. So when something interesting floats by in the world or in my head, I like to write it down before it evaporates. I don’t know what prompted me to write either of these notes. Finding them was like stumbling into someone else’s journal.
You don’t need to work on hard problems
College, 2012—Internship recruiting season. “What are you looking for in your internship?” the recruiter asks. “I’d like to solve hard technical problems,” I reply. I end up at Jane Street writing software to calculate numeric integrals of a function that is costly to evaluate.
Linked Data Notifications
Status Update (May 2017): Links in the overview diagram were fixed in-place on May 22 2017: they were pointing to incorrect internal anchors. Status Update (Sep 2017): A concept URI was fixed in-place on Sep 5 22 2017: it was pointing to incorrect internal URI.
How I automated the boring University stuff with...
Hello, I'm a second-year CS undergrad from India. I love Python. This is my first article here on the dev.to community. So let's begin! My college has a general student login, where students can view their profile, upload assignments, get due dates, download course materials, and stuff.
New! Automatically Discover Website Connections ...
A few years ago Lawrence Alexander published a great piece on finding connections between websites using Google Analytics codes (among others). Last year I published a post where I taught you how to automatically mine some of this information using Python, and then how to visualize it.
Factorio and Software Engineering
I've been a software engineer a while now and I can say this with confidence - it is fun. It's great and I wouldn't trade it for anything else. It's so much fun that some folks try to capture the most enjoyable elements and put them into games. I've played two such games. The first is Shenzhen.io.
Factorio Is The Best Technical Interview We Have...
There's been a lot of hand-wringing over The Technical Interview lately. Many people realize that inverting a binary tree on a whiteboard has basically zero correlation to whether or not someone is actually a good software developer.
How Spotify Optimized the Largest Dataflow Job E...
In this post we’ll discuss how Spotify optimized and sped up elements from our largest Dataflow job, Wrapped 2019, for Wrapped 2020 using a technique called Sort Merge Bucket (SMB) join. We’ll present the design and implementation of SMB and how we incorporated it into our data pipelines.
When will the first Belgian fintech unicorn...
While it seems as if one European financial technology company after another is reaching billion-dollar status this summer, the first Belgian unicorn in the sector has yet to appear. Have we missed the train, or is there still promise in our fintech?
Generating MODS XML from RDF with Go templates
I had heard that Go (also known as “golang”) was an increasingly popular newish programming language before I migrated my blog from being generated by handmade XSLT scripts on snee.com to using the Hugo platform to generate it on bobdc.com.
Linking different knowledge graphs together
Lately I’ve been thinking about some aspects of RDF technology that I have taken for granted as basic building blocks of dataset design but that Knowledge Graph fans who are new to RDF may not be fully aware of—especially when they compare RDF to alternative ways to build knowledge graphs.
Pulling Turtle RDF triples from the Google Knowl...
When I wrote about my first deep dive into Knowledge Graphs, I mentioned that although the term was around well before 2012, the idea of a Knowledge Graph was blessed as an official Google thing that year when one of their engineering SVPs published the article Introducing the Knowledge Graph: thing
What is RDF?
I have usually assumed that people reading this blog already know what RDF is. After recent discussions with people coming to RDF from the Linked (Open) Data and Knowledge Graph worlds, I realized that it would be useful to have a simple explanation that I could point to.
LaTeX Thesis Skeleton
As it might be useful for other students (especially for computer science students at the University of Kaiserslautern), I decided to invest some time and create a skeleton for a thesis. The project can be found on github: http://github.com/joernhees/thesis-skeleton.
What's the best RDF serialization format?
Contrary to some other data models, RDF is not bound to a single serialization format. Triple statements (the data atoms of RDF) can be serialized in many ways, which leaves developers with a possibly tough decision: how should I serialize my linked data?
Full-stack linked data: lessons from building an...
How might a web application work if it exclusively uses Linked Data (RDF) to communicate between server and client? In this article, I’ll tell you about our API journey, why we chose linked data (RDF), the challenges that we faced, and some of the solutions that we came up with.
Tutorial: Building a React front-end app with RD...
A Slice of Pi: Some Small Scale Experiments with...
There was a time when RDF and triplestores were only seen through the lens of massive data integration. Teams went to great extremes to show how many gazillion triples per second their latest development could ingest, and large integrations did likewise with enormous datasets.
Smart City Ontologies: Lessons Learned from Ente...
For the last 20 years, Semantic Arts has been helping firms design and build enterprise ontologies to get them on the data-centric path. We have learned many lessons from the enterprise that can be applied in the construction of smart city ontologies. Utility companies. Sanitation companies.
Processing 40 TB of code from ~10 million projec...
The command line tool I created, Sloc Cloc and Code (scc), which is now modified and maintained by many other excellent people, counts lines of code and comments and makes a complexity estimate for files inside a directory. The latter is something you need a good sample size to make good use of.
Why I Turned Down My Silicon Valley Dream Offer
Ben E. C. Boyter's Blog: 2018 was not a good year for me. To clarify, it was actually the worst year of my life so far. It started well. I had the opportunity and need to learn the programming language Go, which I did, and then released scc https://github.
The case for learned index structures – Part II
The case for learned index structures, Kraska et al., arXiv, Dec. 2017. Yesterday we looked at the big idea of using learned models in place of hand-coded algorithms for select components of systems software, focusing on indexing within analytical databases.
Elasticsearch caching deep dive: Boosting query ...
Cache is king for speedy data retrieval. So if you’re interested in how Elasticsearch leverages various caches to ensure you are retrieving data as fast as possible, buckle up for the next 15 minutes and read through this post.
Stream Processing and Probabilistic Methods: Dat...
Stream processing and related abstractions have become all the rage following the rise of systems like Apache Kafka, Samza, and the Lambda architecture. Applying the idea of immutable, append-only event sourcing means we’re storing more data than ever before.
An Old Hacker's Tips On Staying Employed
I know my core audience is expecting crusty old tales of computing past, but this time I wanted to talk about something a little more topical — about job security, whether it indeed exists in any form, and what if anything you can do to improve your odds of staying employed.
Semantic Web Technologies on an Example of Famil...
Software capable of logical reasoning within some knowledge domain may seem like a tech marvel. However, as can be seen below, writing such software in Python is not difficult if one makes use of semantic web technologies.
Print interviews are filed under Publications/Articles. Steele, Robert, with Interviewers, “Part 1 of 3: Trump Could be the Greatest Ever,” “Part 2: Open Source Solutions Strategy,” “Part 3: The End Game,” (YouTube, each 27 minutes), 27 February 2017.
A book on open source licensing from an engineer’s rather than a lawyer’s perspective that includes a little history of the relevant laws. Why do we have trademarks? Why do we have copyright? Etc. An overview of research findings on open innovation in the enterprise.
Trust is a fundamental element of social capital – a key contributor to sustaining well-being outcomes, including economic development. In this entry we discuss available data on trust, as measured by attitudinal survey questions; that is, estimates from surveys asking about trusting attitudes.
Accelerating Innovation — Uniting Research & Ope...
Open-source software (OSS) underpins all major cloud platforms and a large number of cloud-based services today. Core technologies that power cloud computing – Linux, Kubernetes, Cloud Foundry, Docker, to name a few – are developed in open communities.
DIY Air Quality Sensor
It’s shaping up to be another intense season for wildfires on the West Coast. Last year, we had about 2 weeks where the air quality was bad enough that we needed to limit our time outdoors. It’s been on my mind to stock up with supplies for wildfire season.
Apply conversion functions to data in SQLite col...
Earlier this week I released sqlite-utils 3.14 with a powerful new command-line tool: sqlite-utils convert, which applies a conversion function to data stored in a SQLite column. Anyone who works with data will tell you that 90% of the work is cleaning it up.
Odoo CEO Fabien Pinckaers: ‘We want to succeed...
On a farm in Wallonia, entrepreneur and engineer Fabien Pinckaers built his software company Odoo, step by step, into Belgium's most valuable tech start-up. The freshly acquired unicorn status is only the beginning. ‘So far we serve only 0.1 percent of our market.’
700,000 lines of code, 20 years, and one develop...
Dwarf Fortress is one of those oddball passion projects that’s broken into Internet consciousness. It’s a free game where you play either an adventurer or a fortress full of dwarves in a randomly generated fantasy world.
How I store my files and why you should not rely...
Have you ever lost important data? I have. I learned to do backups the hard way after I lost an entire book I had just finished writing! The year was 2000-something when I had just finished writing a book I had been working on for a couple of years.
Towards Inserting One Billion Rows in SQLite Und...
Current Best: 100M rows inserted in 33 seconds. (You can check the source code on GitHub.) Recently, I ran into a situation where I needed a test database with lots of rows and needed it fast. So I did what any programmer would do: wrote a Python script to generate the DB. Unfortunately, it was slow.
Publishing Data to the Web with Heroku and Datas...
Prerequisites: You need to have the following installed on your machine: Python. I recommend using Anaconda to manage all of your Python-related work. Get Anaconda here and get the 3.7 version. In the instructions below, you will also be installing Homebrew if you are on a Mac.
Bringing the circus into town: the perks of Ligh...
In 1946, residents around the Lockheed Martin factory were presented with a peculiar sight: a large circus tent standing in their parking lot. No lions or clowns in sight, but still the tent sheltered a different kind of magic.