I am back to working on my father’s French epic manuscript index. I use “epic” in the literal sense: stories about Charlemagne and his cast of characters.
I’ve gotten frustrated trying to get my father to do some of the work, so I’ve decided to make an end run around him and just do it myself, which I probably should have done in the first place. In the past, I’ve gotten some of the data into a website, but that is mostly the libraries and the manuscripts themselves. I still have to get the contents of those manuscripts.
I have all the files that he wrote thirty years ago. The first problem is that these files are written in something called “Waterloo Script”. There is nothing I’ve been able to find that reads those files or converts them to anything modern. I have not even been able to find documentation of it. So I will have to write a converter myself.
The files are basically text with a lot of embedded codes that do different things. I have to extract the data from them and eventually put it into SQL. However, I’m going to put it into XML as an intermediate step first and then use XSLT to convert it as appropriate. That way I can convert to HTML easily to see how things are going. (I know that’s a lot of acronyms, but they make sense.) I am tempted to write something generic that will handle any Waterloo Script file, but that may be too ambitious.
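To make the text-to-XML step concrete, here is a rough sketch in JavaScript, purely for illustration. It is not the real converter. It assumes that a control code sits on its own line and starts with a period followed by a short name (something like `.sp 2`), and that every other line is plain text. The element names `document`, `control`, and `text` and the function names are made up for this example.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Convert raw source text into a flat XML document.
// ASSUMPTION: a line of the form ".name arguments" is a control line;
// every other non-blank line is treated as plain text.
// The element names used here are hypothetical.
function scriptToXml(source) {
  const out = ["<document>"];
  for (const line of source.split(/\r?\n/)) {
    const match = /^\.([A-Za-z]+)\s*(.*)$/.exec(line);
    if (match) {
      const name = escapeXml(match[1].toLowerCase());
      const args = escapeXml(match[2].trim());
      out.push(`  <control name="${name}" args="${args}"/>`);
    } else if (line.trim() !== "") {
      out.push(`  <text>${escapeXml(line)}</text>`);
    }
  }
  out.push("</document>");
  return out.join("\n");
}
```

Keeping the codes in the XML rather than throwing them away means the decision about what each one means can be put off until the XSLT stage.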
My first task was to pick a language to do the parsing in. This was important.
I have most of my experience in C# or Swift these days, but those are full-featured languages tied to Microsoft and Apple respectively. And I don’t know how long they will last, as they are so closely tied to one operating system. I need to use an interpreted language that can be used on any machine. I was also looking at this as a learning opportunity to get better with a language I don’t use much.
My first thought was Perl. It is an old language but still used and is well known for being good at text parsing. Sounds ideal. I even have the Camel book that is considered its bible. I haven’t looked at it in 15 years, but an old language doesn’t change much.
The other thought was PHP. It is a language more devoted to web servers and is what I will be using to serve the website eventually, but it can be used for parsing as well. It just isn’t known for it.
I could do it in Ruby which I’m learning, but that is VERY tied to web servers and will come with too much baggage that I don’t want.
I wasn’t sure which to use. I looked online for advice. But most articles seemed to be more religious than secular. People are very devoted to a language they have spent time learning.
For Christmas I wanted to start work on this project. But I didn’t want to take a laptop around with me during my vacation. I have a new iPad Pro with a keyboard; can I do work on that? I even have an app, Coda, that should allow me to edit files. Unfortunately, the iPad doesn’t have a lot of compilers on it.
So, I loaded a whole bunch of files on to my web server. The plan was that I could use the iPad to edit the files remotely and then run them off my server. I wouldn’t be able to do development on an airplane, but who am I kidding? There is no way I would be THAT motivated.
I did some experiments to prove it could work. And I had a magnificent failure. The concept worked in practice and I was able to do exactly what I had theorized. However, I learned something else that changed everything.
In this day and age, being a programmer is fairly language agnostic. Languages have evolved a lot, but mostly in parallel. Many are derived from C/C++, and if you know the basics, it is easy to move from one to the other. (The valuable software development skills are more along the lines of being able to think properly.) That was the thought process I had going in. Although Perl was based on C, it was nearly 30 years old. It did a lot of things in a way that no longer felt natural. It would not be easy to do development in it.
PHP was much closer to C/C++. That would be the better way to continue.
But then I realized there is an even better option. If I wanted to use something not too foreign, why don’t I use JavaScript? I’ve developed other things in it. I hadn’t thought of it before because it is a language for running on a client machine, not a server. But it is powerful, and it will also run in any web browser. And my iPad has a web browser.
Using Coda, I was able to get the data on to the iPad and run it from there. Then I could actually use it on a plane if I wanted. The biggest issue was that accessing the files wasn’t natural. Every other language had ways to open and read files from a file hierarchy. Because JavaScript assumed it would be running on a client machine, it could only access them from a URL, but I could work around that.
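For anyone curious what reading a file from a URL looks like, here is a minimal sketch using the browser’s standard `fetch` function. It is one way to do it, not necessarily the workaround I ended up with. The function name `loadLines` and the `fetchImpl` parameter are made up for this example, and it assumes the file is served as UTF-8 text, which thirty-year-old files may well not be.

```javascript
// Fetch a text file by URL and return its contents as an array of lines.
// ASSUMPTION: the file is served as UTF-8 text and is reachable from the page.
// fetchImpl is a hypothetical parameter so the loader can be tried out
// without a real server; by default it uses the built-in fetch.
async function loadLines(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Could not load ${url}: HTTP ${response.status}`);
  }
  const text = await response.text();
  // Accept both Unix and Windows line endings.
  return text.split(/\r?\n/);
}
```

Calling it looks something like `loadLines("data/sample.txt").then(lines => console.log(lines.length))`, where the path is only a placeholder. Everything that uses the file has to wait for the promise to resolve, which is a large part of why this feels less natural than simply opening a file.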
It has so far been working quite well.