Today I started working with Kerrie on a new web app: MobiDick.
Kerrie is a good friend of mine who has been trying to figure out what to do with her life - and considering getting into tech - for a while.
Over the past nine months or so, she’s taught herself a good bit of programming.
Like all new programmers, though, she ultimately needs to make the leap from “someone who can code” to “someone who can develop software”.
The only way I know how to do that is to try doing a “full size” project, with all of the bells and whistles and pains.
Hence, MobiDick. The idea is simple: why doesn’t Project Gutenberg have a “Send to Kindle” button?
And why do I have to keep the ebooks that I purchase outside of Amazon but ultimately upload to my Kindle - like those from the Pragmatic Programmer - on Dropbox?
Why can’t I have a nice web interface for managing my non-Kindle-store books?
This is a nice sized project for a beginner, for a few reasons:
First, it’s of appropriate scope. You can get started with a single form and a listing.
Second, there are lots of directions to take it.
Implementing a fully featured site will require all the key web application patterns, including authentication, file uploads, email notifications, regular notifications, comments, ratings, and the like.
But there is room to do more! In our first planning session alone, we came up with APIs to interact with a mobile app, background jobs that convert books between file formats, and actually figuring out the .mobi format so that we could make edits to the books.
Plus the idea of writing a Chrome extension to automatically send fan fiction to your Kindle.
There’s lots of opportunity to implement cool, challenging things, beyond your standard Twitter clone.
Third, it’s dear to both Kerrie’s heart and mine. Why build another Pinterest clone when you can build something that actually fills a need you personally experience?
Kerrie’s goal, as far as I know, is to learn more about web development, and develop as a programmer.
My goal is to refresh my knowledge of Rails and early project development, and of my methods of teaching.
Although I’ve onboarded several users onto existing applications over the past year, I haven’t worked with anyone as novice as Kerrie, and I haven’t done any greenfield projects - I want to make sure my skills stay up to date, and I find that teaching is the best way to do it.
The project will simulate, as accurately as possible, the real experience of being a programmer working on a small team for a medium sized web project.
We’re recruiting a friend to serve as the product owner, and pedantically reject our stories in Pivotal Tracker.
We’ll be doing user tests with some other friends who are Kindle addicts.
And the app will ultimately be deployed to Heroku just like any other real application.
Speaking of Pivotal, our first planning session was purely on the topic of Pivotal Tracker.
Pivotal Tracker is the project management tool offered by Pivotal Labs.
I’ve been using Tracker for almost two years now on all of my professional projects.
Introducing Tracker almost always has a learning curve, but introducing it at the same time as Agile is even more of a shock.
Kerrie has never worked on a real project before, and hence has no familiarity with Agile or Scrum.
I made the mistake of introducing Tracker first, then sort of talking about Agile once we had it open; I think it would have been more effective to introduce Agile first.
The key points we covered were:
You never know less about your project than you know right now
What is a user story
Good user stories have context, are precise, unambiguous, and concise.
A brief overview of points / velocity.
The states of a story - unstarted, started, finished, delivered, accepted / rejected.
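A minimal sketch of that story lifecycle as a small state machine. The transition table is my own reading of the states listed above (in particular, that a rejected story goes back to started), not Tracker's actual implementation, and `advance_story` is a hypothetical helper name.

```ruby
# Hypothetical sketch of the story lifecycle described above.
# The allowed transitions are an assumption, not Tracker's real rules.
STORY_TRANSITIONS = {
  unstarted: [:started],
  started:   [:finished],
  finished:  [:delivered],
  delivered: [:accepted, :rejected],
  rejected:  [:started], # assumed: a rejected story gets restarted
  accepted:  []          # terminal state
}.freeze

# Returns the new state, or raises if the move is not allowed.
def advance_story(current, target)
  allowed = STORY_TRANSITIONS.fetch(current)
  unless allowed.include?(target)
    raise ArgumentError, "cannot go from #{current} to #{target}"
  end
  target
end
```

For example, `advance_story(:delivered, :rejected)` returns `:rejected`, while `advance_story(:unstarted, :accepted)` raises.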
We probably could’ve skipped points for now, but since I was introducing Tracker, I had to discuss them, as Tracker won’t let you start a story without assigning it a point value.
One note: The new Tracker intro video, which Kerrie saw when she created the project, was very good.
The explanation is much better than when I first started using Tracker.
That said, I do not know why Tracker still defaults to a split “Current” and “Backlog” view.
I always combine the views, and Kerrie was initially confused when estimating an early-in-current story at 8 points pushed several other stories off of current, and hence, off of her screen.
It definitely was not intuitive at the time - I think this part of their onboarding could be improved.
We then sat down to write stories.
I did my best to offer guidance instead of taking over the situation.
We started with some great ones:
As a user, I can log into my account
As a user, I can scan the bookshelf
As a user that has read a book, I should be able to rate the book
As a user that has selected a book, I should be able to download and read the book on my Kindle
First comment: Why is everyone’s first story user log in? I suppose it’s obvious.
We then started fleshing these stories out.
This will, of course, be an ongoing process.
I tried to focus my prompts into two types:
First, disambiguating nouns. “As a user, I can scan the bookshelf” is a great start to a story - I actually think scan is a particularly good choice of word, because it helped us come up with some more interesting ideas of what “scan” means - but it prompts questions about what a bookshelf is.
So, we ended up with some new stories:
As a user, I want to be able to scan the public bookshelves
As a user, I want to be able to view my own personal bookshelf
I also pointed out the question: where do books come from? So,
As a user, I should be able to upload a book
I then went ahead and added - I should not have done this -
As a user who is uploading a book, if the book is not in .mobi format, I see an error
In retrospect, it would’ve been better to let us discover this naturally.
That said, I’m trying to balance getting through the material quickly with the teaching itself, so hopefully the shortcut did not cost us too much.
We also came up with some fun reach stories:
As a system, I can automatically convert any file into a mobi file
As the system, when the book is uploaded, I can categorize it
As the system, I can create comments within the mobi book on the front page
That last one is my favorite - the idea being that it’s the same as writing in the front cover of the book, if you then lend it to others.
We also ended up with a couple that will need fleshing out, like “As a user, I can receive notifications” (notifications of what? from whom?).
The final list we ended up with as our current section, in order:
As a user, I want to be able to scan the public bookshelves
As a user, I want to be able to view my own personal bookshelf
As a user, I should be able to upload a book
As a user who is uploading a book, if the book is not in .mobi format, I see an error
As a user who is uploading a book, I can set the book’s title
As a user who is uploading a book, I can set the book’s author
And that was the end of our 90 minute work session. I’m really looking forward to implementing these stories!
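As a taste of what the “not in .mobi format” story might turn into, here is a minimal sketch in plain Ruby. `upload_errors` and `ALLOWED_EXTENSIONS` are hypothetical names, and checking only the file extension is an assumption made for brevity - a real check would probably also look at the file contents.

```ruby
# Hypothetical helper for the ".mobi only" upload story.
# Assumption: the extension alone decides whether a file is acceptable.
ALLOWED_EXTENSIONS = [".mobi"].freeze

# Returns a list of error messages; an empty list means the upload is fine.
def upload_errors(filename)
  extension = File.extname(filename.to_s).downcase
  return [] if ALLOWED_EXTENSIONS.include?(extension)

  shown = extension.empty? ? "no extension" : extension
  ["Only .mobi files can be uploaded (got #{shown})"]
end
```

In a Rails app this logic would more likely live in a model validation, but the plain function keeps the rule easy to test on its own.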
For the past ~ year, I’ve been working full time on helping students save money on college tuition with Quottly.
Quottly is a Rails application, and far from my first - I think my first Rails app was on Rails 3.0, built working from a book written for 2.1, and now we’ll shortly be upgrading to Rails 5.
One thing never used to bother me about Ruby and Rails, but does now: the lack of, depending on your perspective, homoiconicity or idempotent database operations.
Here’s the situation. Traditional applications have a sharp divide between data and code. Things that go in databases are data.
Things that the programmer writes, and that are stored in your version control system, are code.
This works pretty well for most applications.
I have a users table in my database, and if I need different code to work with some subset of users, I just give them an appropriate class in my class hierarchy.
When the user’s detail changes, I just update the appropriate database row.
Quottly, on the other hand, is sort of an odd beast. There are lots of objects that, in some senses, straddle the line between data and code.
Let’s take a university as an example. Quottly matches college students with the best classes for them across all schools, so we have to store some information about each university that we work with in our database.
What is a university? Parts of it - for example, the current price of a credit hour - are clearly data. But this “data” has some interesting properties:
First, there can only ever be one instance of a university. There is only one University of Florida, and it is uniquely named; if I ever have two instances of the University of Florida in my database, something is very wrong. And the existence of the University of Florida has nothing to do with its existence in my database; it is, in some ways, a property of the world.
Second, there are bits of things that look very much like code associated with each university. For example, the University of Florida has a rule that you must complete the last 30 credits of your degree at UF. Is that data, or code? I can convert it into “pure data” by making some sort of LastCreditsRule class, putting a row in my database with 30 as the value for “number of credits”, and associating it with the database row that’s associated with UF. But, I could equally (and in many ways, more easily) define that rule with a small bit of code - in effect, that LastCreditsRule is a Verb in what should properly be a Kingdom of Nouns.
I can combine this with a whole lot of wrappers - or some nonstandard active record modifications - to ensure that the create and update operations for the University model in Rails both correspond to what I would describe colloquially as ‘create-or-update-as-appropriate’.
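A minimal sketch of this “pure data” option in plain Ruby, with an in-memory hash standing in for the database table. `UniversityStore`, `upsert`, and the field names are hypothetical; in Rails itself, `find_or_initialize_by` followed by an update is the closest built-in I am aware of for the create-or-update part.

```ruby
# Hypothetical in-memory stand-in for the universities table.
University = Struct.new(:name, :credit_hour_price, :last_credits_required)

class UniversityStore
  def initialize
    @rows = {} # keyed by the globally unique university name
  end

  # Create-or-update-as-appropriate: calling this twice with the same
  # name never produces a second row.
  def upsert(name, attributes)
    row = (@rows[name] ||= University.new(name))
    attributes.each { |key, value| row[key] = value }
    row
  end

  def count
    @rows.size
  end
end

# The rule class holds no per-university logic; the number 30 lives in the row.
class LastCreditsRule
  def initialize(university)
    @required = university.last_credits_required
  end

  def satisfied?(final_credits_at_school)
    final_credits_at_school >= @required
  end
end
```

Running the same `upsert` twice leaves one University of Florida, which is the idempotency I am after; the cost is one more `-Rule` class per kind of rule.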
If I wanted to, I could equally implement the University of Florida functionality by making a University of Florida class, and defining a bit of code in it that would implement the last-30-credits rule. In this case, it’d be pure code - the University of Florida would be a singleton class, information about which is only stored in our version control system, and which would require a redeploy to fix.
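The “pure code” option might look something like the sketch below. The class and method names are hypothetical; only the 30-credit figure comes from the rule described above.

```ruby
require "singleton"

# Hypothetical "pure code" version: the university is a class, not a row.
class UniversityOfFlorida
  include Singleton

  LAST_CREDITS_REQUIRED = 30 # the last-30-credits rule

  def name
    "University of Florida"
  end

  # The rule written directly as code; changing it means a redeploy.
  def residency_satisfied?(final_credits_at_school)
    final_credits_at_school >= LAST_CREDITS_REQUIRED
  end
end
```

`UniversityOfFlorida.instance` always returns the same object, so the “only one University of Florida” property comes for free.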
Neither of these solutions is very appealing. The first, the option in which we treat the university as pure data, involves creating (potentially) a whole lot of Verb classes (things ending in Rule or Policy), which are a code smell. There’s going to be a lot of dumb classes floating around, and that whole class hierarchy will be hard to maintain.
The second option is just not very scalable - if we expand to all 3,000 universities in the US, I’ll end up with 3,000 files in my /models/universities folder. But that solution does have the nice property of letting me easily grab the object I want by its globally unique name whenever I want, which can be convenient.
There is a third, halfway option, in which Ruby code is shoved into a database - making it into data - but then is eval()’d out of the database to implement the -Rule classes. This has the advantage of making the Rule class reasonable, constrained, and somewhat maintainable, as the class won’t need to change, but has other disadvantages (security issues being one of the major ones, in addition to inelegance, and a lack of reliability and ability to test easily).
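To make the halfway option concrete, here is a minimal sketch. `RULE_SOURCE` stands in for a text column read from the database, and `StoredRule` is a hypothetical class name; this is an illustration of the idea, not something I would recommend shipping.

```ruby
# Hypothetical sketch of the "code stored as data" option.
# RULE_SOURCE stands in for a text column read from the database.
RULE_SOURCE = "->(final_credits_at_school) { final_credits_at_school >= 30 }"

class StoredRule
  def initialize(source)
    # eval runs whatever string it is given, which is exactly the security
    # problem: anyone who can write the column can run arbitrary code.
    @callable = eval(source)
    unless @callable.respond_to?(:call)
      raise ArgumentError, "rule source must evaluate to a callable"
    end
  end

  def satisfied?(*args)
    @callable.call(*args) ? true : false
  end
end
```

The class itself never changes when a rule changes, but the rule’s behavior is now invisible to version control and to the test suite.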
I believe that (rare) situations like this are where homoiconic languages show their value. Homoiconic languages “allow all code in the language to be accessed and transformed as data, using the same representation” - that is, to the programmer, there’s no difference between data and code. The languages in the Lisp family are the only example I know of.
In Lisp, I think I could implement this (at first) with a simple file that defines some structs that represent universities, eventually replacing that with some sort of macro that can fetch the university from a database or hash table lazily - I am, sadly, not well versed enough in Lisp to understand how that would work, exactly, but I believe it could be done.
This would seem to me to give the best of both worlds. Since there is no difference in representation between code and data, no decision need be made about what is code and what is data; the different code-like and data-like aspects of the objects may be put into their proper storing places appropriately and easily (and that decision can be changed later without much fanfare).
I am interested in which other situations homoiconic languages would have obvious value. I believe that anything that involves diverse business rules would be a prime candidate - perhaps medical billing systems? - as ‘rules’ naturally fall on the line between code and data. Or, if you have an idea on how to address this situation in Ruby/Rails, I’d love to hear it! If you have an example, ping me on Twitter - @jamescgibson.
Want to use the awesome pdf2htmlEX on Heroku?
You’re not alone.
For Quottly, we do quite a bit of PDF processing - turns out, a lot of colleges and universities like to publish information in PDF format.
We always try to use the pdf-reader Ruby gem if we can, since it’s easy to deploy and maintain, but sometimes pdf-reader just doesn’t have enough power for what we’re trying to do.
We recently got pdf2htmlEX running on our Heroku app. Here’s how.
pdf2htmlEX is distributed either from source or as a Linux package.
To install the Debian package for pdf2htmlEX on Heroku, we first added heroku-buildpack-apt to our application’s buildpacks.
Some old sources (including the README.md on heroku-buildpack-apt) will indicate that the best way to do this is to create a .buildpacks file in your project. However, Heroku now recommends adding the buildpacks from the command line, and/or using an app.json for reproducible deploys.
Then, add an Aptfile for heroku-buildpack-apt to pull from. Each line in the Aptfile is either the name of an apt package, in which case the package will be installed from the standard source archives available on Heroku, or is a link to a specific .deb package.
Either by running apt show on the pdf2htmlEX package, or by referencing this Stack Overflow post, you might come up with the following dependency list:
It’s worth noting that since listing the .deb on its own line installs it without automatically resolving dependencies, you will not receive a build error in the event that pdf2htmlEX installs but is unusable. The only way to confirm that pdf2htmlEX is installed correctly is to:
$ heroku run bash --app YOURAPP
$ pdf2htmlEX --version
and confirm that the output is correct.
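If you would rather run that check from Ruby (say, from a rake task or at boot), a minimal sketch follows. `binary_usable?` is a hypothetical helper, and it assumes that a binary with a missing shared library exits with a non-zero status when asked for its version.

```ruby
require "open3"

# Hypothetical smoke check: run a binary with a version flag and report
# whether it started cleanly, along with whatever it printed.
def binary_usable?(command, version_flag = "--version")
  output, status = Open3.capture2e(command, version_flag)
  [status.success?, output]
rescue Errno::ENOENT
  # The binary is not on the PATH at all.
  [false, "#{command}: not found"]
end
```

Calling `binary_usable?("pdf2htmlEX")` on a dyno returns a pair of a boolean and the captured output, so a linker error shows up in the second element instead of only in an interactive session.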
After deploying with the Aptfile above, you likely will run into an error about a missing libpoppler.so.57. I believe this is because the .deb file that is listed was built against a different libpoppler than the one that is installed here - in this case, libpoppler57 vs libpoppler44.
To fix, let’s just replace the libpoppler44 reference with an explicit reference to the correct .deb file - I found this by looking up libpoppler on the Ubuntu archive website:
This should resolve the libpoppler error. However, after deploying this, I still ran into the same problem listed on that Stack Overflow post -
pdf2htmlEX: /app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by pdf2htmlEX)
pdf2htmlEX: /app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by pdf2htmlEX)
pdf2htmlEX: /app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /app/.apt/usr/lib/x86_64-linux-gnu/libpoppler.so.57)
The issue here is that the version of libstdc++6 being installed doesn’t include GLIBCXX_3.4.20 - we just need a newer version of libstdc++6. A quick upgrade:
A few caveats: I’m not entirely familiar with how linking on mirrors.kernel.org works, so I believe it is possible that these links may break some time in the future. Additionally, I would feel more comfortable if every one of the dependencies were locked down to a specific .deb - I’m concerned that a version bump on e.g. libgcc1 may break this build.
However, I think that it shouldn’t be too terribly difficult to cross that road if and when it arises - all that is needed is to determine which version of libgcc1 is installed on a working system, and then hard link to that .deb.
Everyone - especially young people - could probably do well to invest more
of their income. One of the reasons that it is hard to get the motivation
to invest is that the outcomes are not clear, and the pace of progress is
oftentimes hard to gauge.
After all, if you check your portfolio regularly, you might see swings in
its value of hundreds of dollars every day, even though, on average, over
the course of years, you can be confident that your portfolio will grow.
And, what is the goal? Many of my friends who are the most thrifty in
general - who have to make the fewest life changes to begin investing
- are the ones who see the point the least. They don’t value material
things, but they do value experiences and freedom. So what is the point
of amassing a large portfolio?
Every $1000 you invest gives you one dime per day
That’s how I’ve started thinking about it. If I invest $1,000, I can then
(99 times out of 100) count on being able to spend one dime per day, in
current money, forever, without reducing my portfolio balance.
How? To withdraw one dime per day, my portfolio must on average earn
$36.50 per year - a return of just 3.65%.
With an aggressive but diversified portfolio of stocks and bonds and
reasonable inflation - say, 70% stocks earning on average 7.0%, 30% bonds
earning on average 3%, and 2% inflation, the math works out:
Your 70% allocation to stocks - or $700 - will on average earn 7.0%, or
($700 * .07) = $49.00 per year. Your $300 of bonds will earn on average
($300 * .03) = $9.00 per year, for a total of $58.00 per year.
You need to increase the balance of your portfolio by 2% - or $20 - each
year to counteract the effects of inflation, leaving you $38 per year to
spend.
$38 per year is just a hair over a dime per day, or $36.50 per year. So
thinking in terms of a dime per day is both easy and suitably
conservative.
Of course, there’s more to this - if you actually live on a 3.65%
withdrawal rate, you’ll run out of money in 30 years about 1 out of every
100 times - but in general, it’s a safe way to think about things.
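The arithmetic above can be sketched in a few lines of Ruby. The constants are the figures from the text (70/30 split, 7% and 3% returns, 2% inflation); `spendable_per_year` is a hypothetical helper name, and the whole thing is an average-case calculation that ignores year-to-year swings.

```ruby
# Inputs taken from the figures in the text above.
INVESTED     = 1_000.0
STOCK_SHARE  = 0.70
STOCK_RETURN = 0.07
BOND_SHARE   = 0.30
BOND_RETURN  = 0.03
INFLATION    = 0.02

# Dollars per year that can be spent while keeping the balance level in
# inflation-adjusted terms, on average.
def spendable_per_year(invested)
  earnings = invested * (STOCK_SHARE * STOCK_RETURN + BOND_SHARE * BOND_RETURN)
  earnings - invested * INFLATION
end

per_year = spendable_per_year(INVESTED)
per_day  = per_year / 365
puts format("per year: $%.2f, per day: $%.3f", per_year, per_day)
```

For $1,000 this prints $38.00 per year and about $0.104 per day, and because the formula is linear, `spendable_per_year(20_000)` is simply twenty times as much.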
Backing things out
A dime per day isn’t much, or doesn’t seem like much. But thinking about
things starting with needs, not wants, makes it clear how much it can be.
What is the bare minimum to live? There are lots of things that make life
liveable - but ultimately the only things you must have are food and
shelter.
Buying in bulk, cornmeal is about $0.15 per 200 calories. Pasta is about
$0.20 per 200 calories. Rice and beans, which together provide all the
essential amino acids for dietary protein, if prepared yourself, are in
approximately the same range - say $0.15 per 200 calories. Canola oil,
a healthy fat, is less than $0.10 per 200 calories.
Net, you should be able to eat a relatively healthy 1600 calorie per day
diet (certainly not enough to thrive, but enough to live) for about $2.00
per day. Twenty dimes.
Hence, if you can get $20,000 into an investment account, you will never,
ever need to go hungry.
That’s a powerful statement. All that is required to never be hungry in
your life is saving $20,000 - quite a reasonable sum.
This extends to all the other aspects of life as well. In low income
neighborhoods of most non-major cities, you can find rooms for rent for as
little as $300/m, or 100 dimes per day, and in many cities you can
actually buy a (granted, low quality) house for as little as $50,000
- which, with a 30 year mortgage, will cost you about $300/m as well.
If you can get $100,000 invested, you will never be homeless.
$20,000 for food and $100,000 for housing - plus, say, another $0.50 per
day for odds and ends - and you can be assured that you will never be
destitute if you can get $125,000 together.
That’s powerful, because investing $125,000 is eminently doable. If you
have a college degree, you should be able to find a job that will pay at
least $30,000 per year - and if you simply continue to live as you did in
college, which was likely on less than $10,000 per year, you can save an
additional $15,000 per year and have $125,000 in just 7 years.
And every extra $1000 you save - which is only 50 hours of work at $20
/ hour, a pretty reasonable rate for freelance work - increases your
standard of living by $0.10 per day, forever.
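As a rough check on the seven-year figure, here is a minimal sketch. It assumes deposits of $15,000 land at the end of each year and that the balance grows at the blended 5.8% nominal return implied by the 70/30 portfolio earlier; both are simplifying assumptions of mine, and `years_to_reach` is a hypothetical helper.

```ruby
# Assumptions: end-of-year deposits, and a steady 5.8% nominal return
# (70% stocks at 7% plus 30% bonds at 3%), with no taxes or fees.
ANNUAL_SAVINGS = 15_000.0
ANNUAL_RETURN  = 0.058
TARGET         = 125_000.0

# Counts how many yearly deposits it takes for the balance to reach target.
def years_to_reach(target, annual_savings, annual_return)
  balance = 0.0
  years = 0
  while balance < target
    balance = balance * (1 + annual_return) + annual_savings
    years += 1
  end
  years
end

puts years_to_reach(TARGET, ANNUAL_SAVINGS, ANNUAL_RETURN)
```

Under those assumptions the balance passes $125,000 at the seventh deposit; a lower return or a bad stretch in the market would push that out.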