WordCram gem for JRubyArt and propane
Wordle
Wordle is essentially a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. The Wordle applet was created by by Jonathan Feinberg (when at IBM hence Wordle has licence restrictions).
A key part of Wordle is the cue.language.jar (NB: this also used by the WordCram library). cue.language is a small library of Java code and resources that provides the following basic natural-language processing capabilities:
- Tokenizing natural language text into individual words
- Tokenizing natural language text into sentences
- Tokenizing natural language text into n-grams (sequences of 2 or more words that appear next to each other in a sentence)
- Counting strings
- Detecting which script (alphabet, writing system) is required to represent a text
- Guessing what language a text is in
- Customizable “stop word” detection for a variety of languages
The original by Jonathan Feinberg (when at IBM hence licence restrictions for cue.language.jar) see github.
WordCram
WordCram is essentially a re-implementation of Wordle by Dan Bernier using Processing
(and hence somewhat more customisable than Wordle
). It does the heavy lifting – text analysis, collision detection – for you, so you can focus on making your word clouds as beautiful, as revealing, or as silly as you like. It does use Jonathan Feinbergs cue.language.jar
to analyze the text.
Ruby WordCram
To make it even easier to to use the WordCram library in JRubyArt and propane we have created a rubygem wrapper. I had thought to recompile the libraries but currently I just extract the required jars from the WordCram distribution.
gem install ruby_wordcram
An example sketch
# This sketch shows how to make a WordCram from any webpage.
# It uses my blog
# Minya Nouvelle font available at http://www.1001fonts.com/font_details.html?font_id=59
require 'ruby_wordcram'
def settings
size 800, 400
end
def setup
sketch_title 'WordCram from Web Page'
color_mode(HSB)
background(255)
WordCram.new(self)
.from_web_page('https://ruby-processing.github.io/about/')
.with_font(create_font(data_path('MINYN___.TTF'), 1))
.with_colorer(Colorers.two_hues_random_sats_on_white(self))
.sized_by_weight(7, 100)
.draw_all
end
Output
More Sketches in ruby
Here we prefer File.readlines
to loadStrings
and use map to create an array of Word
, the only unfortunate thing is that we need to cast ruby array to java array.
require 'ruby_wordcram'
def settings
size(800, 600)
end
def setup
sketch_title 'US Male and Female First Names'
background 255
names = File.readlines(data_path('names.txt')).map do |line|
name, frequency, sex = line.split
col = 'f' == sex ? color('#f36d91') : color('#476dd5')
Word.new(name, frequency.to_f).set_color(col)
end
WordCram.new(self)
.from_words(names.to_java(Java::Wordcram::Word)) # cast to java array
.with_font(create_font(data_path('MINYN___.TTF'), 1))
.sized_by_weight(12, 60)
.draw_all
end
Reset version
require 'ruby_wordcram'
attr_reader :names, :wc, :reset
def settings
size(800, 600)
end
def setup
sketch_title 'US Male and Female First Names Reset'
@names = File.readlines(data_path('names.txt')).map do |line|
name, frequency, sex = line.split
col = 'f' == sex ? color('#f36d91') : color('#476dd5')
Word.new(name, frequency.to_f).set_color(col)
end
@reset = true
make_word_cram
end
def make_word_cram
# NB: see cast to java array
@wc = WordCram.new(self)
.from_words(names.to_java(Java::Wordcram::Word))
.with_font(create_font(data_path('MINYN___.TTF'), 1))
.sized_by_weight(12, 60)
end
def draw
background 255 if reset
@reset = false
if wc.has_more
wc.draw_next
else
puts 'done'
no_loop
end
end
def mouse_clicked
make_word_cram
@reset = true
loop
end