Obligations in the Brave New Internet

privacy

Twitter is abuzz with reports of Facebook’s recently published paper. In short, researchers manipulated roughly 689,000 users’ news feeds to change the users’ moods. You can read about it, depending on what kind of editorial slant you prefer, on Engadget, Slate, Forbes, &c.

Feelings aside, there is a deep issue. Just what are our obligations to each other as experimenters?

Many of the complaints about the Facebook paper rest on the concept of informed consent, and on whether Facebook’s user agreement constitutes such consent.

It is clear to me that it does not. But we kid ourselves by pretending we expect every business to gain informed consent from its customers before doing science.

Granted, the usual business experimentation is somewhat different. If a grocery store experiments with food scents to drive pie sales, we do not expect the grocer to obtain informed consent. However, there is a presumption that the grocer will only ever test ‘positive’ options. We expect grocers to avoid ‘eau d’skunk’, which could cause negative feelings.

So is it possible that informed consent is not required when all the treatments will cause positive feelings? But we allow other companies to experiment with negative feelings.

Though academic research on sensationalism in television news is scant, at least one paper, Explaining Effects of Sensationalism on Liking of Television News Stories, indicates that the extent to which stories are ‘liked’ is influenced by their sensationalism. It is not hard to imagine local news production managers choosing more sensationalist stories to drive viewership, causing negative feelings in viewers. However, we do not expect the newsman to gain informed consent.

Neither the grocer nor the newsman will publish his results in PNAS. Informed consent should be a requirement for publishing in a journal, but should not publishing your work, or publishing without peer review, exempt you from informed consent? The experiment still took place.

Neither grocer nor newsman will conduct a proper scientific experiment, with simultaneous treatment of randomly selected treatment groups. Their findings will be confounded by dozens of temporal and treatment-order variables. But should poor experimental design, i.e. poor science, exempt the grocer or newsman from informed consent?

There is clearly a continuum of experimentation occurring every day, from a waitperson trying out different greetings to get a bigger tip, to large scale medical experiments. At what point should we require what level of informed consent? Is Facebook’s experiment like that of the grocer, or like that of Zimbardo?

I do not have an answer to these questions, but I think that ‘Facebook did not obtain informed consent’ is not the full story.

Refactoring Ids_from_emails in Ruby

ruby

As part of the development of UpsideOS, I refactor old but functional code on a rolling basis. As a lone developer, the trade-offs between ‘ship it’ and ‘do it right’ are even more pressing than ever, so there is plenty to refactor. Here is the thought process I went through while refactoring one of the functions in UpsideOS.

We let users add new users to a team by specifying their email address. We need the ids of the user records to create the association. Here is a static method that takes a mixed array of integer user ids and email addresses and returns an array of integer user ids. The User class is a Rails ActiveRecord class.

def self.ids_from_emails(array)
  if array
    new_array = []
    array.delete("")
    array.each do |id|
      if id.is_a? String and id.to_i == 0
        new_array.push User.find_by_email(id).id if User.find_by_email(id)
      else
        new_array.push id.to_i if id.to_i and id.to_i != 0
      end
    end
    new_array
  end
end
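To make the current behaviour concrete before changing anything, here is a minimal sketch that runs without Rails. The in-memory User class, its RECORDS table, and the example addresses are hypothetical stand-ins for the real ActiveRecord model; only ids_from_emails is copied from above.

```ruby
# Hypothetical in-memory stand-in for the ActiveRecord User model.
class User
  Record = Struct.new(:id)
  RECORDS = { "ann@example.com" => 10, "bob@example.com" => 20 }.freeze

  # Mimics find_by_email: returns a record, or nil when nothing matches.
  def self.find_by_email(email)
    id = RECORDS[email]
    id && Record.new(id)
  end

  # The method under refactoring, unchanged.
  def self.ids_from_emails(array)
    if array
      new_array = []
      array.delete("")
      array.each do |id|
        if id.is_a? String and id.to_i == 0
          new_array.push User.find_by_email(id).id if User.find_by_email(id)
        else
          new_array.push id.to_i if id.to_i and id.to_i != 0
        end
      end
      new_array
    end
  end
end

p User.ids_from_emails([1, "2", "ann@example.com", "", "nobody@example.com"])
# => [1, 2, 10]
p User.ids_from_emails(nil)
# => nil
```

Two details are worth noticing: integer ids pass through without any database lookup, and array.delete("") mutates the caller's array.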

reek identifies two code smells:

ids_from_emails calls User.find_by_email(id) twice (DuplicateMethodCall)
ids_from_emails calls id.to_i 4 times (DuplicateMethodCall)

We can ‘fix’ these duplicate method call code smells using temporary variables:

def self.ids_from_emails(array)
  if array
    new_array = []
    array.delete("")
    array.each do |id|
      id_int = id.to_i
      if id.is_a? String and id_int == 0
        temporary_user = User.find_by_email(id)
        new_array.push temporary_user.id if temporary_user
      else
        new_array.push id_int if id_int and id_int != 0
      end
    end
    new_array
  end
end

This fixes the code smells in reek, but introduces a new TooManyStatements smell. The tests still pass. But the code is not improved. The code smells pointed to deeper issues.

The method is written in a procedural, imperative style. It is 14 lines long. Moving towards a declarative style could improve it. The is_a? call on the sixth line of the original stands out; explicit type checking often indicates deep issues, though in a mixed array of integers and strings it may be unavoidable.

First, skip checking if array is nil by setting it to a default empty array. If set to [], the array.each method will not iterate over anything and the proper, empty new_array will be returned. Whether the method is ever called with nil as the argument is another problem, but outside the scope here.

def self.ids_from_emails(array)
  array = array || []
  new_array = []
  array.delete("")
  array.each do |id|
    #...
  end
  new_array
end

Creating a new_array and returning it explicitly is imperative as is calling .each. Since each returns the array it was called on, it can only do useful work by side effects. We can instead use map, which returns an array of block results.
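The difference between the two iterators is easy to see in isolation. This is a small illustrative sketch; the email values are made up.

```ruby
emails = ["ann@example.com", "bob@example.com"]

# each returns its receiver, so the block's results are thrown away.
each_result = emails.each { |email| email.upcase }

# map returns a new array built from the block's results.
map_result = emails.map { |email| email.upcase }

p each_result.equal?(emails) # => true
p map_result                 # => ["ANN@EXAMPLE.COM", "BOB@EXAMPLE.COM"]
```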

However, if the iterator block fails to find a record, nil will be added to the returned array. The array.delete("") line was originally there to keep blank entries out of the result, so we can simplify things by just removing nil values before we return. Ruby’s Array#compact does nicely:

def self.ids_from_emails(array)
  array = array || []
  array = array.map do |id|
    #...
  end
  array.compact
end
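Here is the same map-then-compact shape as a runnable sketch, with a plain hash standing in for the database. IDS_BY_EMAIL and lookup_ids are hypothetical names used only for illustration.

```ruby
# Hypothetical lookup table standing in for the users table.
IDS_BY_EMAIL = { "ann@example.com" => 10 }.freeze

def lookup_ids(values)
  values = values || []
  # Unknown or blank entries map to nil, and compact then drops them.
  values.map { |value| IDS_BY_EMAIL[value] }.compact
end

p lookup_ids(["ann@example.com", "", "nobody@example.com"]) # => [10]
p lookup_ids(nil)                                           # => []
```

On Ruby 2.7 or later, filter_map does the mapping and drops nil (and false) results in a single call.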

That inner block is now more than half of the method, suggesting we could refactor it out into a new method. Whether or not this makes sense depends on whether we expect this method to change much in the future, but we can try it:

def self.ids_from_emails(array)
  array = array || []
  array = array.map{|value| id_from_email_or_id(value)}.compact
end

# Defined on the class as well, because ids_from_emails calls it without a receiver.
def self.id_from_email_or_id(value)
  if value.is_a? String and value.to_i == 0
    User.find_by_email(value).id if User.find_by_email(value)
  else
    value.to_i if value.to_i and value.to_i != 0
  end
end

The tests still pass, but the newly created method is ugly. It still has a call to is_a?, which is sub-optimal. There are in fact three cases to handle here: an id is passed, in which case it must be converted to an integer; an email is passed and the User is in the database, in which case we should look up their id and include it; or some string that is neither an integer id nor a known email address is passed, in which case we should return nil.

The sets of valid ids and valid email addresses are mutually exclusive, so we can simply try both lookups and keep whichever one succeeds. Hence:

def self.ids_from_emails(array)
  array = array || []
  array = array.map{|value| id_from_email_or_id(value)}.compact
end

# Defined on the class as well, because ids_from_emails calls it without a receiver.
def self.id_from_email_or_id(value)
  (User.find_by_email(value).id rescue nil) || (User.find(value).id rescue nil)
end

This final refactor has no code smells in reek and is closer to idiomatic Ruby. The style is more declarative than imperative, and the result is easier to read. There is one behavioural change to be aware of: an integer id is now returned only if a matching User exists, whereas the original passed any non-zero integer through. Using rescue so heavily may not be quite as efficient as the old imperative style, but until speed becomes an issue, the maintainability benefits are probably worth it. The method has additionally been reduced from 14 lines to 3 between two methods, an (admittedly subjective) aesthetic improvement.
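If the rescue modifiers feel too broad (they swallow every StandardError, not just a missing record), one possible variation relies on lookups that return nil instead of raising, as ActiveRecord's find_by does. The sketch below is an assumption-laden illustration, not the code that shipped: the in-memory User class and its simplified find_by are hypothetical stand-ins so the example runs without Rails.

```ruby
# Hypothetical in-memory stand-in for the ActiveRecord User model.
class User
  Record = Struct.new(:id, :email)
  RECORDS = [
    Record.new(10, "ann@example.com"),
    Record.new(20, "bob@example.com")
  ].freeze

  # Simplified imitation of find_by: a matching record, or nil.
  def self.find_by(email: nil, id: nil)
    if email
      RECORDS.find { |record| record.email == email }
    else
      wanted = Integer(id, exception: false) # nil when id is not numeric
      RECORDS.find { |record| record.id == wanted }
    end
  end

  def self.ids_from_emails(array)
    (array || []).map { |value| id_from_email_or_id(value) }.compact
  end

  # Try the email lookup first, then the id lookup; nil if neither matches.
  def self.id_from_email_or_id(value)
    user = find_by(email: value) || find_by(id: value)
    user&.id
  end
end

p User.ids_from_emails([10, "20", "ann@example.com", "", "nobody@example.com", 99])
# => [10, 20, 10]
```

As with the rescue version, ids that do not belong to an existing user (99 here) are dropped.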

Use Blank? To Avoid Code Smells in Rails

ruby

To improve my programming skills, I am refactoring some Ruby on Rails projects. I am using code smell tools like reek to do so.

One of the code smells reek checks for is NilCheck. Checking if an object is nil is a type of explicit type checking, indicating abstraction problems. If polymorphism is working then no explicit checks are necessary.

However, in Rails, ActiveRecord objects have their fields initialized to nil unless a default is set in the database. But after the object has been saved from a form, string fields will be empty ("").

How do we condition on ‘presence of a string of length more than zero’? We can’t call length since nil.length throws an exception. And if we write string.nil? or string == "", we are making an explicit type check.

However, in Rails, both nil and String respond to blank?. Therefore string_variable.blank? will fulfill the role of string_variable.nil? or string_variable == "" nicely, avoiding code smells and unnecessary database defaults.
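To show why this counts as polymorphism rather than a type check, here is a plain-Ruby sketch. The two blank? definitions are simplified stand-ins for what ActiveSupport provides in a Rails app, included only so the example runs without Rails; display_name is a hypothetical helper.

```ruby
# Simplified stand-ins for ActiveSupport's blank? (not needed inside Rails).
class NilClass
  def blank?
    true
  end
end

class String
  # Empty and whitespace-only strings both count as blank.
  def blank?
    strip.empty?
  end
end

# The caller sends one message and never asks what type it is holding.
def display_name(nickname)
  nickname.blank? ? "(no nickname)" : nickname
end

p display_name(nil)   # => "(no nickname)"
p display_name("")    # => "(no nickname)"
p display_name("Sam") # => "Sam"
```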

Setting Up PhoneGap on Ubuntu

phonegap

I decided to revisit PhoneGap as an option for developing mobile applications. Here are some gotchas I ran into while setting up PhoneGap on Ubuntu 14.04 LTS. This might be relevant for installing Apache Cordova stand-alone as well, though I have not tried this.

The install page assumes you know that you need to install the SDK for the platform you want to test on. You can download the Android SDK here. The SDK’s sdk/tools/ and sdk/platform-tools/ directories have to be on the system’s PATH, which can be accomplished with:

$ export PATH="/path/to/sdk/tools/:$PATH"
$ export PATH="/path/to/sdk/platform-tools/:$PATH"

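If you want these directories added to your PATH automatically, add the above lines (without the leading $ prompt) to ~/.bash_profile. On a default Ubuntu desktop, terminal windows start non-login shells, which read ~/.bashrc instead, so that file may be the better place.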

You also have to create an Android Virtual Device for PhoneGap to use. To do this, run the android script, which is found in the sdk/tools directory. The default interface only shows you which platform SDKs you have installed; to create a virtual device, use the wizard accessed from the Tools > Manage AVDs… menu.

Be careful when allocating android virtual devices; if your system does not have as much free memory as the device image requires, the manager will automatically reduce the memory allocated and give you an appropriate warning.

However, I could not get a Nexus 4 or Nexus 5 image to boot after the system had automatically reduced the allocation. The loader appeared to make progress booting the image (with a “this may take a while” message), but the image did not load for me even after 45+ minutes. Choosing an image that needed little enough memory that it was not automatically reduced during launch, or manually setting the memory to a suitably low value, prevented this.

First Post Using Jekyll

jekyll

A few weeks ago, one of the WordPress installations that lives on my personal Amazon EC2 micro server was hacked, and my server became a (very slow, ineffective) part of a DDoS botnet. I took the server down immediately, but decided that running WordPress for my simple site was silly.

This site (at least before the weeks of downtime) received no more than 2 or 3 visitors per day. The micro server is slow enough that I had to install page caching to get reasonable performance anyway. So why have a CPU idling 24x7 in the cloud just to fetch a bit of HTML out of memory three times a day?

So, the website you see now is created with Octopress, which is built on Jekyll. I’m quite happy with it so far, though I did have a little bit of trouble modifying the slash theme to colors I prefer.

Other than zero server maintenance, the real benefit of this so far is that I can store my blog in a git repository! Version controlling prose is one of the world’s great hacks in my opinion. Once one starts (in my case with LaTeX for academic work), it is hard to go back.

I will attempt to get some of the old posts back up with URL redirects at some point.