Genetic programming (GP) heavily relies on existing time series data. In this post I am going to look into different requirements and problems related to data.
First, we need to get the data from somewhere. There are both commercial and free data providers. Here is a list of free data providers.
This is a wordcloud of G. Deleuze’s and F. Guattari’s A Thousand Plateaus: Capitalism and Schizophrenia created with this nice tool. I removed all words with fewer than a hundred occurrences, some abbreviations and otherwise not very expressive words such as also, thus, and etc. This is where I borrowed the term rhizome from. Enjoy!
I am always subtly amused when opening a book and one of the first pages encountered states that this page is intentionally left blank. Because, of course, it isn’t. There’s a statement printed on it. The situation reminds me of a first-time meditator deliberately trying to empty his or her mind of all thoughts - because, that’s how meditation is supposed to work, isn’t it? At least according to the first-time meditator’s belief. Yet, the more we try to empty our mind, the more we notice how distracted we actually are.
At first glance, rhizomes may have a lot in common with existing technologies. Yet, on closer inspection, there are important differences, and it is not possible to simply reduce a rhizome to one or another existing technology. In this post I will quickly compare rhizomes to a variety of different mathematical and computational concepts and data structures.
This post demonstrates how it is possible to use rhizomes to store simple HTML.
Consider the following HTML.
<html><head></head><body></body></html>
How could we store this in a rhizome? First of all, it would make sense to treat every HTML tag as an atomic symbol. There are three such symbols in the sample: html, head and body.
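As a minimal sketch of that first step only, the snippet below pulls the distinct tag names out of the sample using Python's standard html.parser module. The names TagSymbolCollector and collect_symbols are hypothetical helpers made up for this illustration. How a rhizome would actually store or relate the symbols is not shown here.

```python
from html.parser import HTMLParser


class TagSymbolCollector(HTMLParser):
    """Hypothetical helper: records each distinct tag name once, in document order."""

    def __init__(self):
        super().__init__()
        self.symbols = []

    def handle_starttag(self, tag, attrs):
        # Treat the tag name itself as the atomic symbol; attributes are ignored.
        if tag not in self.symbols:
            self.symbols.append(tag)


def collect_symbols(markup):
    """Return the distinct tag names found in markup, in order of first appearance."""
    collector = TagSymbolCollector()
    collector.feed(markup)
    collector.close()
    return collector.symbols


symbols = collect_symbols("<html><head></head><body></body></html>")
print(symbols)  # ['html', 'head', 'body']
```

The sketch keeps only the symbol names and discards the nesting, so it covers the vocabulary of the sample but not its structure.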