Overview
A word tree depicts multiple parallel sequences of words. It could be used to show which words most often follow or precede a target word (e.g., "Cats are...") or to show a hierarchy of terms (e.g., a decision tree).
Google word trees are able to process large amounts of text quickly. Modern systems should be able to handle novel-sized amounts of text without significant delay.
Note: The word tree is in beta and may be undergoing substantial revisions in future Google Charts releases.
Word trees are rendered in the browser using SVG, so they work in all modern browsers (e.g., Chrome, Firefox, Opera, and Internet Explorer 9+). Like all Google charts, word trees display tooltips when the user hovers over the data.
A simple example
Suppose you've collected a set of phrases about cats (e.g., "cats eat mice", "cats are better than kittens") and you want to highlight the most important attributes from the set.
This word tree depicts a tree of phrases, with the size of the words proportional to their usage. In this set of phrases, "cats eat mice" occurs four times, and "cats eat" occurs six times (four times with "mice", and twice with "kibble").
Try hovering over the words to see information about frequency.
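To make the frequency counts concrete, here is a minimal sketch that tallies how many phrases start with each word prefix. It is only an illustration of the counts quoted above, not the chart's internal algorithm; the countPrefixes helper is a hypothetical name introduced for this example.

```javascript
// Hypothetical helper: counts how many phrases begin with each word prefix
// ("cats", "cats eat", "cats eat mice", ...). Illustration only; this is
// not how the word tree itself is implemented.
function countPrefixes(phrases) {
  const counts = new Map();
  for (const phrase of phrases) {
    const words = phrase.split(/\s+/).filter(Boolean);
    for (let i = 1; i <= words.length; i++) {
      const prefix = words.slice(0, i).join(' ');
      counts.set(prefix, (counts.get(prefix) || 0) + 1);
    }
  }
  return counts;
}

// The same phrases used in the example page for this chart.
const catPhrases = [
  'cats are better than dogs', 'cats eat kibble',
  'cats are better than hamsters', 'cats are awesome',
  'cats are people too', 'cats eat mice', 'cats meowing',
  'cats in the cradle', 'cats eat mice', 'cats in the cradle lyrics',
  'cats eat kibble', 'cats for adoption', 'cats are family',
  'cats eat mice', 'cats are better than kittens', 'cats are evil',
  'cats are weird', 'cats eat mice'
];

const prefixCounts = countPrefixes(catPhrases);
console.log(prefixCounts.get('cats eat'));        // 6
console.log(prefixCounts.get('cats eat mice'));   // 4
console.log(prefixCounts.get('cats eat kibble')); // 2
```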
Here's the web page that generates the above word tree:
<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
      google.charts.load('current', {packages:['wordtree']});
      google.charts.setOnLoadCallback(drawChart);

      function drawChart() {
        var data = google.visualization.arrayToDataTable(
          [ ['Phrases'],
            ['cats are better than dogs'],
            ['cats eat kibble'],
            ['cats are better than hamsters'],
            ['cats are awesome'],
            ['cats are people too'],
            ['cats eat mice'],
            ['cats meowing'],
            ['cats in the cradle'],
            ['cats eat mice'],
            ['cats in the cradle lyrics'],
            ['cats eat kibble'],
            ['cats for adoption'],
            ['cats are family'],
            ['cats eat mice'],
            ['cats are better than kittens'],
            ['cats are evil'],
            ['cats are weird'],
            ['cats eat mice'],
          ]
        );

        var options = {
          wordtree: {
            format: 'implicit',
            word: 'cats'
          }
        };

        var chart = new google.visualization.WordTree(document.getElementById('wordtree_basic'));
        chart.draw(data, options);
      }
    </script>
  </head>
  <body>
    <div id="wordtree_basic" style="width: 900px; height: 500px;"></div>
  </body>
</html>
Word trees are case-sensitive. If you want "Cats" and "cats" to be treated the same, use JavaScript's toLowerCase() or toUpperCase() methods on your text before providing it to the word tree.
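As a rough sketch of that normalization step, the helper below lowercases every phrase and wraps it in the row format used by the implicit example. The function name toLowerCaseRows and the sample phrases are assumptions made for illustration.

```javascript
// Hypothetical helper: lowercases each phrase and returns rows in the shape
// expected by google.visualization.arrayToDataTable for an implicit tree
// (a header row followed by one single-column row per phrase).
function toLowerCaseRows(phrases) {
  const rows = [['Phrases']];
  for (const phrase of phrases) {
    rows.push([phrase.toLowerCase()]);
  }
  return rows;
}

const normalizedRows = toLowerCaseRows(['Cats eat mice', 'cats EAT kibble']);
console.log(normalizedRows);
// In a page, the result would be passed on like this:
//   var data = google.visualization.arrayToDataTable(normalizedRows);
```

If the root word is set with the wordtree.word option, it needs the same casing as the normalized text.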
Implicit and explicit Word Trees
There are two ways to create word trees: implicitly and explicitly. The choice is specified with the wordtree.format option.
format: 'implicit'
- The word tree will take a set of phrases, in any order, and construct the tree according to the frequency of the words and subphrases.
format: 'explicit'
- You tell the word tree what connects to what, how big to make each subphrase, and what colors to use.
The word tree in the previous section was an implicit Word Tree: we just specified an array of phrases, and the word tree figured out how big to make each word.
In an explicit word tree, the chart creator directly provides information about which words link to which, their color, and size.
There are several differences between the two word trees we've seen so far. The layout of the first word tree was calculated implicitly from a set of phrases, but in this word tree we've specified which words appear, where they appear, and how big they are.
This word tree is so wide that it's unlikely to fit onscreen. When that's the case, the word tree fades out at the edge. You can navigate the tree by clicking on any word.
function drawSimpleNodeChart() { var nodeListData = new google.visualization.arrayToDataTable([ ['id', 'childLabel', 'parent', 'size', { role: 'style' }], [0, 'Life', -1, 1, 'black'], [1, 'Archaea', 0, 1, 'black'], [2, 'Eukarya', 0, 5, 'black'], [3, 'Bacteria', 0, 1, 'black'], [4, 'Crenarchaeota', 1, 1, 'black'], [5, 'Euryarchaeota', 1, 1, 'black'], [6, 'Korarchaeota', 1, 1, 'black'], [7, 'Nanoarchaeota', 1, 1, 'black'], [8, 'Thaumarchaeota', 1, 1, 'black'], [9, 'Amoebae', 2, 1, 'black'], [10, 'Plants', 2, 1, 'black'], [11, 'Chromalveolata', 2, 1, 'black'], [12, 'Opisthokonta', 2, 5, 'black'], [13, 'Rhizaria', 2, 1, 'black'], [14, 'Excavata', 2, 1, 'black'], [15, 'Animalia', 12, 5, 'black'], [16, 'Fungi', 12, 2, 'black'], [17, 'Parazoa', 15, 2, 'black'], [18, 'Eumetazoa', 15, 5, 'black'], [19, 'Radiata', 18, 2, 'black'], [20, 'Bilateria', 18, 5, 'black'], [21, 'Orthonectida', 20, 2, 'black'], [22, 'Rhombozoa', 20, 2, 'black'], [23, 'Acoelomorpha', 20, 1, 'black'], [24, 'Deuterostomia', 20, 5, 'black'], [25, 'Chaetognatha', 20, 2, 'black'], [26, 'Protostomia', 20, 2, 'black'], [27, 'Chordata', 24, 5, 'black'], [28, 'Hemichordata', 24, 1, 'black'], [29, 'Echinodermata', 24, 1, 'black'], [30, 'Xenoturbellida', 24, 1, 'black'], [31, 'Vetulicolia', 24, 1, 'black']]); var options = { colors: ['black', 'black', 'black'], wordtree: { format: 'explicit', type: 'suffix' } };
<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
      google.charts.load('current', {packages:['wordtree']});
      google.charts.setOnLoadCallback(drawSimpleNodeChart);

      function drawSimpleNodeChart() {
        var nodeListData = new google.visualization.arrayToDataTable([
          ['id', 'childLabel', 'parent', 'size', { role: 'style' }],
          [0, 'Life', -1, 1, 'black'],
          [1, 'Archaea', 0, 1, 'black'],
          [2, 'Eukarya', 0, 5, 'black'],
          [3, 'Bacteria', 0, 1, 'black'],
          [4, 'Crenarchaeota', 1, 1, 'black'],
          [5, 'Euryarchaeota', 1, 1, 'black'],
          [6, 'Korarchaeota', 1, 1, 'black'],
          [7, 'Nanoarchaeota', 1, 1, 'black'],
          [8, 'Thaumarchaeota', 1, 1, 'black'],
          [9, 'Amoebae', 2, 1, 'black'],
          [10, 'Plants', 2, 1, 'black'],
          [11, 'Chromalveolata', 2, 1, 'black'],
          [12, 'Opisthokonta', 2, 5, 'black'],
          [13, 'Rhizaria', 2, 1, 'black'],
          [14, 'Excavata', 2, 1, 'black'],
          [15, 'Animalia', 12, 5, 'black'],
          [16, 'Fungi', 12, 2, 'black'],
          [17, 'Parazoa', 15, 2, 'black'],
          [18, 'Eumetazoa', 15, 5, 'black'],
          [19, 'Radiata', 18, 2, 'black'],
          [20, 'Bilateria', 18, 5, 'black'],
          [21, 'Orthonectida', 20, 2, 'black'],
          [22, 'Rhombozoa', 20, 2, 'black'],
          [23, 'Acoelomorpha', 20, 1, 'black'],
          [24, 'Deuterostomia', 20, 5, 'black'],
          [25, 'Chaetognatha', 20, 2, 'black'],
          [26, 'Protostomia', 20, 2, 'black'],
          [27, 'Chordata', 24, 5, 'black'],
          [28, 'Hemichordata', 24, 1, 'black'],
          [29, 'Echinodermata', 24, 1, 'black'],
          [30, 'Xenoturbellida', 24, 1, 'black'],
          [31, 'Vetulicolia', 24, 1, 'black']]);

        var options = {
          colors: ['black', 'black', 'black'],
          wordtree: {
            format: 'explicit',
            type: 'suffix'
          }
        };

        var wordtree = new google.visualization.WordTree(document.getElementById('wordtree_explicit'));
        wordtree.draw(nodeListData, options);
      }
    </script>
  </head>
  <body>
    <div id="wordtree_explicit" style="width: 900px; height: 500px;"></div>
  </body>
</html>
In the above code, you can see that we construct our DataTable manually. We first declare our five columns:
- The index number (used to identify the parent of a word).
- The text to appear in the tree. (It doesn't have to be a word.)
- The parent of the word, with -1 meaning "no parent".
- The size of the word.
- The color of the word.
Then we add a row for each word. Here's an example:
nodeListData.addRow([9, 'Amoebae', 2, 1, 'black']);
This is row #9, adding the word Amoebae to the word tree. The parent is row 2 (Eukarya), the size is 1 (in no particular unit), and the color is black. All of the colors in this word tree are black, but the sizes are different.
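If your hierarchy already exists as nested objects, the five-column rows can be generated rather than typed by hand. The sketch below assumes a hypothetical node shape of {label, size, children} and a hypothetical helper named toExplicitRows; neither is part of Google Charts. It assigns ids depth-first, so the numbering may differ from the hand-written table above.

```javascript
// Hypothetical helper: flattens a nested {label, size, children} object into
// the [id, text, parent, size, style] rows used by explicit word trees.
// The root gets parent -1, meaning "no parent".
function toExplicitRows(root, color) {
  const rows = [['id', 'childLabel', 'parent', 'size', { role: 'style' }]];
  let nextId = 0;

  function visit(node, parentId) {
    const id = nextId++;
    rows.push([id, node.label, parentId, node.size || 1, color]);
    for (const child of node.children || []) {
      visit(child, id);
    }
  }

  visit(root, -1);
  return rows;
}

const explicitRows = toExplicitRows({
  label: 'Life',
  children: [
    { label: 'Archaea' },
    { label: 'Eukarya', size: 5 },
    { label: 'Bacteria' }
  ]
}, 'black');

console.log(explicitRows[3]); // [2, 'Eukarya', 0, 5, 'black']
```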
Text size
In explicit word trees, the actual display size of each word is affected by two things: the size specified for the word, and the size specified for all the words below it (that is, to the right) in the tree. In the above word tree, Life has three children: Archaea (size 1), Eukarya (size 5), and Bacteria (size 1).
Because we haven't provided much vertical space for this word tree, the 21 phyla of bacteria are likely to overflow the available space. The word tree collapses them, so you see a tendril labelled "21 more". If you click on Bacteria, the word tree will recenter and you'll be able to see the 21 phyla. Clicking on Bacteria again will recenter "up" the tree.
If this automatic calculation of text size makes some words too big, you can set an upper bound with the maxFontSize option:
var options = { maxFontSize: 14, wordtree: { format: 'explicit', type: 'suffix' } };
<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
      google.charts.load('current', {packages:['wordtree']});
      google.charts.setOnLoadCallback(drawSimpleNodeChart);

      function drawSimpleNodeChart() {
        var nodeListData = new google.visualization.arrayToDataTable([
          ['id', 'childLabel', 'parent', 'size', 'weight'],
          [0, 'Life', -1, 1, 0],
          [1, 'Archaea', 0, 1, 0],
          [2, 'Eukarya', 0, 5, 0],
          [3, 'Bacteria', 0, 1, 0],
          [4, 'Crenarchaeota', 1, 1, 0],
          [5, 'Euryarchaeota', 1, 1, 0],
          [6, 'Korarchaeota', 1, 1, 0],
          [7, 'Nanoarchaeota', 1, 1, 0],
          [8, 'Thaumarchaeota', 1, 1, 0],
          [9, 'Amoebae', 2, 1, 0],
          [10, 'Plants', 2, 1, 0],
          [11, 'Chromalveolata', 2, 1, 0],
          [12, 'Opisthokonta', 2, 5, 0],
          [13, 'Rhizaria', 2, 1, 0],
          [14, 'Excavata', 2, 1, 0],
          [15, 'Animalia', 12, 5, 0],
          [16, 'Fungi', 12, 2, 0],
          [17, 'Parazoa', 15, 2, 0],
          [18, 'Eumetazoa', 15, 5, 0],
          [19, 'Radiata', 18, 2, 0],
          [20, 'Bilateria', 18, 5, 0],
          [21, 'Orthonectida', 20, 2, 0],
          [22, 'Rhombozoa', 20, 2, 0],
          [23, 'Acoelomorpha', 20, 1, 0],
          [24, 'Deuterostomia', 20, 5, 0],
          [25, 'Chaetognatha', 20, 2, 0],
          [26, 'Protostomia', 20, 2, 0],
          [27, 'Chordata', 24, 5, 0],
          [28, 'Hemichordata', 24, 1, 0],
          [29, 'Echinodermata', 24, 1, 0],
          [30, 'Xenoturbellida', 24, 1, 0],
          [31, 'Vetulicolia', 24, 1, 0],
          [32, 'Actinobacteria', 3, 1, 0],
          [33, 'Firmicutes', 3, 1, 0],
          [34, 'Tenericutes', 3, 1, 0],
          [35, 'Aquificae', 3, 1, 0],
          [36, 'Deinococcus-Thermus', 3, 1, 0],
          [37, 'Fibrobacteres-Chlorobi/Bacteroidetes', 3, 1, 0],
          [38, 'Fusobacteria', 3, 1, 0],
          [39, 'Gemmatimonadetes', 3, 1, 0],
          [40, 'Nitrospirae', 3, 1, 0],
          [41, 'Planctomycetes-Verrucomicrobia/Chlamydiae', 3, 1, 0],
          [42, 'Proteobacteria', 3, 1, 0],
          [43, 'Spirochaetes', 3, 1, 0],
          [44, 'Synergistetes', 3, 1, 0],
          [45, 'Acidobacteria', 3, 1, 0],
          [46, 'Chloroflexi', 3, 1, 0],
          [47, 'Chrysiogenetes', 3, 1, 0],
          [48, 'Cyanobacteria', 3, 1, 0],
          [49, 'Deferribacteres', 3, 1, 0],
          [50, 'Dictyoglomi', 3, 1, 0],
          [51, 'Thermodesulfobacteria', 3, 1, 0],
          [52, 'Thermotogae', 3, 1, 0]]);

        var options = {
          maxFontSize: 14,
          wordtree: {
            format: 'explicit',
            type: 'suffix'
          }
        };

        var wordtree = new google.visualization.WordTree(
          document.getElementById('wordtree_explicit_maxfontsize'));
        wordtree.draw(nodeListData, options);
      }
    </script>
  </head>
  <body>
    <div id="wordtree_explicit_maxfontsize" style="width: 900px; height: 500px;"></div>
  </body>
</html>
Prefix, suffix, and double Word Trees
The word trees we've seen so far are all suffix word trees: the root word is on the left, and words immediately following the root are on the right. In a prefix word tree, the root is on the right, and in a double word tree, it's in the center. Here's Lincoln's Gettysburg address as a prefix word tree culminating in the word 'nation':
Here's the same speech as a suffix word tree, also rooted at 'nation':
You specify a suffix tree by providing type: 'suffix' in the chart options:
var options = { wordtree: { format: 'implicit', type: 'suffix', word: 'nation' } };
A double word tree marries the prefix and suffix word trees:
You specify a double word tree by providing type: 'double' in the chart options. Note that double word trees must always specify a root word, and should always be format: 'implicit'.
var options = { wordtree: { format: 'implicit', type: 'double', word: 'nation' } };
The root of the tree is specified in the word option, so with a little HTML we can give users the ability to select the root from their web page:
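One possible way to wire up such a selector is to rebuild the options object whenever the user picks a new root and then redraw. The sketch below covers only the options-building step; buildWordTreeOptions and the element id root_word are hypothetical names, and the DOM wiring is shown as comments because it depends on your page.

```javascript
// Hypothetical helper: builds chart options for an implicit word tree with a
// user-selected root word. Double word trees must always specify a root.
function buildWordTreeOptions(rootWord, type) {
  const allowedTypes = ['prefix', 'suffix', 'double'];
  if (!allowedTypes.includes(type)) {
    throw new Error('Unknown word tree type: ' + type);
  }
  if (type === 'double' && !rootWord) {
    throw new Error('Double word trees must specify a root word.');
  }
  return {
    wordtree: {
      format: 'implicit',
      type: type,
      word: rootWord
    }
  };
}

console.log(buildWordTreeOptions('nation', 'double'));

// Possible page wiring, assuming a <select id="root_word"> element and the
// `wordtree` and `data` variables from your draw function:
//   document.getElementById('root_word').addEventListener('change', function (e) {
//     wordtree.draw(data, buildWordTreeOptions(e.target.value, 'double'));
//   });
```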
The full web page for this word tree:
<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
      google.charts.load('current', {packages:['wordtree']});
      google.charts.setOnLoadCallback(drawSimpleNodeChart);

      function drawSimpleNodeChart() {
        // The speech is a single phrase; it is concatenated here so that no
        // string literal spans a line break.
        var data = google.visualization.arrayToDataTable(
          [ ['Phrases'],
            ['Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. ' +
             'Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. ' +
             'We are met on a great battle-field of that war. ' +
             'We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. ' +
             'It is altogether fitting and proper that we should do this. ' +
             'But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. ' +
             'The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. ' +
             'The world will little note, nor long remember what we say here, but it can never forget what they did here. ' +
             'It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. ' +
             'It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government of the people, by the people, for the people, shall not perish from the earth.']
          ]
        );

        var options = {
          wordtree: {
            format: 'implicit',
            type: 'double',
            word: 'nation'
          }
        };

        var wordtree = new google.visualization.WordTree(document.getElementById('wordtree_double'));
        wordtree.draw(data, options);
      }
    </script>
  </head>
  <body>
    <div id="wordtree_double" style="width: 900px; height: 500px;"></div>
  </body>
</html>
Styling Word Trees
You can control the typeface and colors of your word trees. Typefaces are set with the fontName option:
Here's the options stanza for the above chart:
var options = { wordtree: { format: 'implicit', word: 'cats' }, fontName: 'Times-Roman' };
Color is more subtle. Like size, it can optionally be used to indicate some attribute of the words in the tree. If we want to color the words in our "cats" word tree to indicate sentiment, we can supply that information in our DataTable.
In the above word tree, we construct our DataTable as follows:
var data = new google.visualization.arrayToDataTable([ ['phrase', 'size', 'value'], ['cats are better than dogs', 1, 8], ['cats eat kibble', 1, 5], ...
We set the sizes of all our words to 1, but let color (labeled arbitrarily as 'value' in the above snippet) range from 0 ("cats are evil") to 10 ("cats are awesome") to indicate sentiment. Then in our options, we specify three colors: lowest, neutral, and highest:
var options = { wordtree: { format: 'implicit', word: 'cats' }, colors: ['red', 'black', 'green'] };
Colors can also be specified explicitly. Here's a word tree that shows potential moves for a chess game. In addition to setting the colors of the words to 'white' and 'black', the background color is set to hex value '#cba' with the backgroundColor option:
The DataTable for this word tree is declared as follows:
function drawChart() { var data = new google.visualization.arrayToDataTable([ ['id', 'childLabel', 'parent', 'weight', { role: 'style' }], [0, 'PK4', -1, 1, 'white'], [1, 'PK4', 0, 1, 'black'], ...
Note that the column containing the colors is identified not by a text label like 'parent' or 'weight', but by the style role: { role: 'style' }.
We set the background color in the options:
var options = { wordtree: { format: 'explicit' }, backgroundColor: '#cba' };
Tokenizing sentences
Implicit word trees are broken into sentences and words according to simple rules, expressed as regular expressions. In rare cases you might want to override the default behavior, and in those cases you can use the wordSeparator and sentenceSeparator options.
If you're fluent in regular expressions, the defaults may make sense to you:
- sentenceSeparator: \s*(.+?(?:[?!]+|$|\.(?=\s+[A-Z]|$)))\s*
- wordSeparator: ([!?,;:.&"-]+|\S*[A-Z]\.|\S*(?:[^!?,;:.\s&-]))
Note: Regex splitting is nonstandard in Internet Explorer 8 and may lead to unexpected results.
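To get a feel for what the default expressions match, you can try them outside the chart. The sketch below applies them with standard JavaScript string methods; how the word tree applies them internally is not documented here, so treat the splitSentences and splitWords helpers as hypothetical approximations rather than the chart's actual tokenizer.

```javascript
// The default patterns from the list above, written as JavaScript regex
// literals with the global flag so every match is returned.
const sentenceSeparator = /\s*(.+?(?:[?!]+|$|\.(?=\s+[A-Z]|$)))\s*/g;
const wordSeparator = /([!?,;:.&"-]+|\S*[A-Z]\.|\S*(?:[^!?,;:.\s&-]))/g;

// Hypothetical helper: returns the first capture group of each sentence match.
function splitSentences(text) {
  return Array.from(text.matchAll(sentenceSeparator), function (m) {
    return m[1];
  });
}

// Hypothetical helper: returns every word or punctuation run in a sentence.
function splitWords(sentence) {
  return sentence.match(wordSeparator) || [];
}

console.log(splitSentences('Cats eat mice. Cats are awesome!'));
// [ 'Cats eat mice.', 'Cats are awesome!' ]

console.log(splitWords('cats eat mice, not kibble.'));
// [ 'cats', 'eat', 'mice', ',', 'not', 'kibble', '.' ]
```

Note that punctuation is kept as separate tokens rather than discarded, which is worth keeping in mind if you write a replacement pattern.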
Loading
The google.charts.load package name is "wordtree":
google.charts.load("current", {packages: ["wordtree"]});
The visualization's class name is google.visualization.WordTree:
var visualization = new google.visualization.WordTree(container);
Data format
Rows: Each row in the DataTable represents text to be displayed. For implicit word trees, the text of all rows is combined and tokenized before being displayed.
Columns for Implicit Word Trees:
| | Column 0 | Column 1 | Column 2 |
|---|---|---|---|
| Purpose: | Text | Size (optional) | Style (optional) |
| Data Type: | string | number | string |
| Role: | domain | data | data |
Columns for Explicit Word Trees:
| | Column 0 | Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|---|---|
| Purpose: | ID | Text | Parent | Size | Style |
| Data Type: | number | string | number | number | string |
| Role: | domain | data | data | data | data |
Configuration options
Name | |
---|---|
colors |
A list of three colors, specified either by English name or hex value. The colors for words will be taken from a spectrum that begins at the first color (the low value), moves through the middle color (neutral), and ends at the last color (high). Type: Array of strings
Default: default colors
|
forceIFrame |
Draws the chart inside an inline frame. (Note that on IE8, this option is ignored; all IE8 charts are drawn in i-frames.) Type: boolean
Default: false
|
fontName |
The word tree typeface. Type: string
Default: default
|
height |
Height of the chart, in pixels. Type: number
Default: height of the containing element
|
maxFontSize |
The upper limit for font size of displayed words. Type: number
Default: null
|
width |
Width of the chart, in pixels. Type: number
Default: width of the containing element
|
wordtree.format |
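If 'implicit', the word tree takes a set of phrases, in any order, and constructs the tree according to the frequency of the words and subphrases. If 'explicit', you tell the word tree what connects to what, how big to make each subphrase, and what colors to use. Type: string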
Default:
'implicit' |
wordtree.sentenceSeparator |
For implicit word trees, the regular expression to use to break the text into sentences.
The sentences are then broken into words using the wordtree.wordSeparator option. Type: regex
Default:
\s*(.+?(?:[?!]+|$|\.(?=\s+[A-Z]|$)))\s* |
wordtree.type |
Whether the word tree is a prefix tree (root word on the right), a suffix tree (left), or double tree (middle). Type: string
Default: 'suffix'
|
wordtree.word |
For implicit word trees, which word to use as the root of the tree. (Note that word trees are case sensitive.) This option must be specified for double word trees. Type: string
Default: null
|
wordtree.wordSeparator |
For implicit word trees, the regular expression to use to break sentences into individual words to be displayed. Type: regex
Default:
([!?,;:.&"-]+|\S*[A-Z]\.|\S*(?:[^!?,;:.\s&-])) |
Methods
Method | |
---|---|
draw(data, options) |
Draws the chart. The chart accepts further method calls only after the ready event is fired.
Return Type: none
|
clearChart() |
Clears the chart, and releases all of its allocated resources. Return Type: none
|
Events
Name | |
---|---|
ready |
The chart is ready for external method calls. If you want to interact with the chart, and
call methods after you draw it, you should set up a listener for this event before you
call the draw method, and call them only after the event was fired. Properties: none
|
select |
Fired when the user selects a word, either to "zoom" into or out of the tree. Properties: word, color, weight
|
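A minimal sketch of registering listeners for both events follows. It uses google.visualization.events.addListener, the standard Google Charts call for attaching handlers; the watchWordTree wrapper and its parameter names are hypothetical, and the namespace is passed in as an argument only so the function can be exercised without loading the library.

```javascript
// Hypothetical wrapper: attaches 'ready' and 'select' handlers to a chart.
// `viz` is expected to be the google.visualization namespace.
// Call this before chart.draw() so the 'ready' event is not missed.
function watchWordTree(viz, chart, onReady, onSelect) {
  viz.events.addListener(chart, 'ready', onReady);
  viz.events.addListener(chart, 'select', onSelect);
}

// Possible usage inside a draw function, assuming `chart`, `data`, and
// `options` are already defined:
//   watchWordTree(google.visualization, chart,
//     function () { console.log('word tree is ready'); },
//     function () { console.log('a word was selected'); });
//   chart.draw(data, options);
```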
Data policy
All code and data are processed and rendered in the browser. No data is sent to any server.