DataFrame library for Ruby


☗ Mikon

Mikon is a flexible data structure for the Ruby language, inspired by R's data.frame and Python's pandas. Its goal is to make it easy to manipulate real data, apply statistical functions to it, and visualize the results in Ruby.

It is compatible with Nyaplot::DataFrame and Statsample::Vector, and most methods from both gems can be applied to Mikon’s data structures.

Main Features:

Dependencies

Optional Dependencies

Installation

$ gem install mikon

If you fail to install nmatrix, try:

$ sudo apt-get install libatlas-base-dev
$ sudo apt-get --purge remove liblapack-dev liblapack3 liblapack3gf
$ gem install nmatrix -- --with-opt-include=/usr/include/atlas

More detailed instructions are available for Mac and Linux.

If you fail to install iruby, try this.

Examples

Notebooks created with IRuby:

Usage

Initializing DataFrame

require 'mikon'
df2 = Mikon::DataFrame.new([{a: 1, b: 2}, {a: 2, b: 3}, {a: 3, b: 4}])

Mikon::DataFrame.new({a: [1,2,3,4], b: [2,3,4,5]}, index: [:a, :b, :c, :d])
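
The two constructors above accept the same kind of table in two shapes: an array of row hashes, or a hash of column arrays. As a minimal sketch in plain Ruby (it does not use Mikon, and says nothing about how Mikon stores data internally), the two shapes can be converted into each other like this:

```ruby
# Plain-Ruby sketch (no Mikon required): one table, two layouts.
rows = [{a: 1, b: 2}, {a: 2, b: 3}, {a: 3, b: 4}]

# Row-oriented -> column-oriented: gather each key's values into one array.
columns = rows.first.keys.to_h { |key| [key, rows.map { |row| row[key] }] }
p columns

# Column-oriented -> row-oriented: transpose the value arrays and re-attach the keys.
rebuilt = columns.values.transpose.map { |vals| columns.keys.zip(vals).to_h }
p rebuilt == rows
```

The sketch assumes every row hash has the same keys; rows with missing keys would need explicit handling.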

df = Mikon::DataFrame.from_csv("~/data.csv")

Basic data manipulation

df[:value]

df[10..20]

df.head(2)

df.tail(2)

Row-based data manipulation

df.select{value > 100}
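
In the block above, the column name `value` is used as if it were a local method. One way to get that behavior in Ruby is to evaluate the block once per row with `instance_eval`, against an object that answers column names through `method_missing`. The sketch below is a hypothetical illustration of that technique; `TinyFrame` and `RowContext` are made-up names, and this is not claimed to be Mikon's implementation.

```ruby
# Hypothetical sketch, NOT Mikon's implementation: evaluate a block once per
# row, with each column name available as a method inside the block.
class TinyFrame
  def initialize(rows)
    @rows = rows # array of hashes, one hash per row
  end

  # Keep the rows for which the block returns a truthy value.
  def select(&block)
    kept = @rows.select { |row| RowContext.new(row).instance_eval(&block) }
    TinyFrame.new(kept)
  end

  def to_a
    @rows
  end

  # Wraps a single row and answers column names as method calls.
  class RowContext
    def initialize(row)
      @row = row
    end

    def method_missing(name, *args)
      return @row[name] if args.empty? && @row.key?(name)
      super
    end

    def respond_to_missing?(name, include_private = false)
      @row.key?(name) || super
    end
  end
end

frame = TinyFrame.new([{value: 50}, {value: 150}, {value: 300}])
big = frame.select { value > 100 }
p big.to_a
```

A limitation of this approach is that a column whose name collides with an existing method (for example `hash` or `class`) never reaches `method_missing`, so such columns would need a different accessor.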

df2.map{b+1}.name(:c)

foo = []
df2.each{foo.push(2*a)}
p foo #-> [2,4,6]

df.insert_column(:new_value){value * 2}

df.any?{value >= 100} #-> true
df.all?{value > 1} #-> false

Column-based data manipulation

In most cases, column-based manipulation is faster than row-based manipulation.

df2[:b] - df2[:a]
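
To make the difference between the two styles concrete, here is a minimal sketch in plain Ruby arrays (no Mikon involved) that computes the same column difference both ways. It does not measure speed; it only shows that the column-based form works on whole arrays at once, while the row-based form builds and reads one record per row.

```ruby
# Plain-Ruby sketch (no Mikon required): whole-column arithmetic on arrays.
a = [1, 2, 3]
b = [2, 3, 4]

# Column-based: a single pass over two parallel arrays.
diff = b.zip(a).map { |y, x| y - x }

# Row-based equivalent: build a hash per row, then look the values up again.
rows = a.zip(b).map { |x, y| {a: x, b: y} }
diff_by_row = rows.map { |row| row[:b] - row[:a] }

p diff
p diff == diff_by_row
```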

df.insert_column(:new_value, df[:value]*2)

Plotting

df[:value].plot

Plotting with Nyaplot

require 'nyaplot'
plot = Nyaplot::Plot.new
plot.add_with_df(df, :histogram, :value)
plot

Statistics with Statsample

Mikon::Series is compatible with Statsample::Vector, so most methods of Statsample can be applied to Mikon::Series.

require 'statsample'

Statsample::Analysis.store(Statsample::Test::T) do
  t_2 = Statsample::Test.t_two_samples_independent(df[:value], df[:new_value])
  summary t_2
end

Statsample::Analysis.run_batch
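
For readers who want to see what a two-sample t statistic measures, the following plain-Ruby sketch computes one common form of it, Welch's unequal-variance t, for two small made-up samples. It is an illustration only: the helper names are hypothetical, the data is invented, and it is not meant to reproduce the report that Statsample prints.

```ruby
# Plain-Ruby sketch of Welch's t statistic for two independent samples.
# Illustration only; helper names and data are made up for this example.
def mean(xs)
  xs.sum.to_f / xs.size
end

# Unbiased sample variance (divides by n - 1).
def sample_variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m)**2 } / (xs.size - 1)
end

# Difference of the means divided by its estimated standard error.
def welch_t(xs, ys)
  standard_error = Math.sqrt(sample_variance(xs) / xs.size + sample_variance(ys) / ys.size)
  (mean(xs) - mean(ys)) / standard_error
end

value     = [1, 2, 3, 4, 5] # made-up data
new_value = [2, 3, 4, 5, 6] # made-up data
t = welch_t(value, new_value)
puts t
```

A t value near zero means the two sample means are close relative to their spread; a full test also needs the degrees of freedom and a p-value, which the sketch leaves out.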

License

MIT License

Acknowledgement

Ruby Association Grant 2014 has been earmarked for the development of Mikon.

Contributing

  1. Fork it ( http://github.com/domitry/mikon/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Run the tests with rspec in /path_to_gem/mikon/
  5. Push to the branch (git push origin my-new-feature)
  6. Create new Pull Request