{"id":611,"date":"2014-11-10T23:50:47","date_gmt":"2014-11-10T18:50:47","guid":{"rendered":"http:\/\/blogs.nd.edu\/emorgan\/?p=611"},"modified":"2014-11-10T23:50:47","modified_gmt":"2014-11-10T18:50:47","slug":"r","status":"publish","type":"post","link":"https:\/\/sites.nd.edu\/emorgan\/2014\/11\/r\/","title":{"rendered":"My first R script, wordcloud.r"},"content":{"rendered":"<p>\nThis is my first R script, wordcloud.r:\n<\/p>\n<blockquote><p><code><\/p>\n<pre>\r\n#!\/usr\/bin\/env Rscript\r\n\r\n# wordcloud.r - output a wordcloud from a set of files in a given directory\r\n\r\n# Eric Lease Morgan &lt;eric_morgan@infomotions.com&gt;\r\n# November 8, 2014 - my first R script!\r\n\r\n\r\n# configure\r\nMAXWORDS    = 100\r\nRANDOMORDER = FALSE\r\nROTPER      = 0\r\n\r\n# require\r\nlibrary( NLP )\r\nlibrary( tm )\r\nlibrary( methods )\r\nlibrary( RColorBrewer )\r\nlibrary( wordcloud )\r\n\r\n# get input; needs error checking!\r\ninput &lt;- commandArgs( trailingOnly = TRUE )\r\n  \r\n# create and normalize corpus\r\ncorpus &lt;- VCorpus( DirSource( input[ 1 ] ) )\r\ncorpus &lt;- tm_map( corpus, content_transformer( tolower ) )\r\ncorpus &lt;- tm_map( corpus, removePunctuation )\r\ncorpus &lt;- tm_map( corpus, removeNumbers )\r\ncorpus &lt;- tm_map( corpus, removeWords, stopwords( \"english\" ) )\r\ncorpus &lt;- tm_map( corpus, stripWhitespace )\r\n\r\n# do the work\r\nwordcloud( corpus, max.words = MAXWORDS, random.order = RANDOMORDER, rot.per = ROTPER )\r\n\r\n# done\r\nquit()<\/pre>\n<p><\/code><\/p><\/blockquote>\n<p>\nGiven the path to a directory containing a set of plain text files, the script will generate a wordcloud.\n<\/p>\n<p>\nLike Python, R has a library well-suited for text mining &#8212; <a href=\"http:\/\/cran.r-project.org\/web\/packages\/tm\/\">tm<\/a>. Its approach to text mining (or natural language processing) is both similar and dissimilar to Python&#8217;s. 
They are similar in that they both hope to provide a means for analyzing large volumes of text. They are dissimilar in that they use different underlying data structures to get there. R might be more for the analytic person. Think statistics. Python may be more for the &#8220;literal&#8221; person, all puns intended. I will see if I can exploit the advantages of both.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is my first R script, wordcloud.r: #!\/usr\/bin\/env Rscript # wordcloud.r &#8211; output a wordcloud from a set of files in a given directory # Eric Lease Morgan &lt;eric_morgan@infomotions.com&gt; # November 8, 2014 &#8211; my first R script! # configure MAXWORDS = 100 RANDOMORDER = FALSE ROTPER = 0 # require library( NLP ) library( [&hellip;]<\/p>\n","protected":false},"author":92,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-611","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/posts\/611","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/users\/92"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/comments?post=611"}],"version-history":[{"count":2,"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/posts\/611\/revisions"}],"predecessor-version":[{"id":613,"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/posts\/611\/revisions\/613"}],"wp:attachment":[{"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/media?parent=611"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.nd.edu\/em
organ\/wp-json\/wp\/v2\/categories?post=611"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.nd.edu\/emorgan\/wp-json\/wp\/v2\/tags?post=611"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}