Update on parking & weather for #THATCampSoCal this weekend


Please note that our Parking page had an error, which has been corrected. Parking fees only apply on the Friday of THATCamp ($8.00 for a daily permit). There is NO charge for parking on Saturday.  Please check our Parking page for full details on where to park and where to obtain a visitor parking permit for the day.

Weather & Dress

Weather forecasts indicate that it’s still going to be pretty hot here in Fullerton this Friday and Saturday (mid-90s). Our venue contact has assured us that air conditioning will be on at a good strength both days in all of our venue rooms and spaces. If you’re the type that gets cold when A/C is on, please bring a sweater or sweatshirt.

Despite the promise of good A/C, you might want to dress for hot weather. THATCamps are casual unconferences, so please feel free to wear shorts. And in Southern California, flip-flops and sandals are always appropriate attire!

Permanent link to this article:


#THATCampSoCal Workshop Preview: Scalar

If you are attending our Workshop day this Friday, but haven’t heard of Scalar before, this video will give you a preview of what you’ll learn and work with in our Web Publishing with Scalar workshop!

Scalar Platform — Trailer from IML @ USC on Vimeo.



Text Mining Workshop

The text mining workshop will take place on Friday, 14 September from 1:30-3:30 in Pavilion C (Main Room).

Workshop Description

This workshop will introduce the basic concept of text mining: the discovery of knowledge through the analysis of digital texts using computational approaches. The workshop will cover the stages of text mining from preparing the texts, to performing analyses, to visualising the results. We will focus on two emerging methods of text mining that are easy for the novice to learn but sophisticated enough to produce real results.

Lexomics is a method for clustering texts or parts of texts based on their word frequencies. The technique allows users to examine similarities and differences between texts in a way that can point to interpretive insights or directions of further enquiry into the style, authorship, and origin of the texts. Topic modelling is a technique for using word frequencies to extract individual units of discourse (called “topics”) from texts. Texts can then be compared based on the presence of certain topics, or the proportion of certain topics can be traced across a corpus over time (or by other criteria).
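To make the idea of comparing texts by word frequency concrete, here is a minimal sketch in Python. It is not the code behind the Lexomics tools: the function names and the three tiny sample texts are made up for illustration, and it only finds the single most similar pair of texts, which is the first step of the kind of clustering described above.

```python
import math
import re
from collections import Counter


def relative_frequencies(text):
    """Return each word's share of the total word count in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def distance(freq_a, freq_b):
    """Euclidean distance between two relative-frequency profiles."""
    vocabulary = set(freq_a) | set(freq_b)
    return math.sqrt(
        sum((freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) ** 2 for w in vocabulary)
    )


def closest_pair(texts):
    """Return the names of the two texts whose profiles are most similar."""
    profiles = {name: relative_frequencies(body) for name, body in texts.items()}
    names = sorted(profiles)
    pairs = [
        (distance(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    _, a, b = min(pairs)
    return a, b


# Made-up sample texts, for illustration only.
samples = {
    "A": "the king and the queen rode to the hall",
    "B": "the queen and the king rode to the hall",
    "C": "data mining finds patterns in large text collections",
}
print(closest_pair(samples))
```

Texts A and B use the same words the same number of times, so their profiles are identical even though the word order differs. That is the sense in which this kind of analysis looks at frequencies rather than at sequence.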

There will be a hands-on component to the workshop to allow participants to learn the software tools for exploring these methods. We will also discuss the epistemological and hermeneutic issues raised by the use of text mining approaches to the analysis of texts in the Humanities.

Advance Preparation for the Workshop

No prior experience with computational text analysis is necessary. The tools for performing lexomics analysis are web based, so you do not need to download them in advance. These tools may be found on the Lexomics web site:

There are many tools for performing topic modelling, but we will use the GUI Topic Modeling Tool which may be downloaded at Please download it in advance of the workshop. Note that in order to run the GUI Topic Modeling Tool, you will need to have Java installed on your computer. You can test whether Java is working and find out how to install it at

Please feel free to download the sample texts for use during the hands-on session.

Finally, please have a copy of Google Chrome or Firefox installed on your computer, as the lexomics tools have not been tested with Internet Explorer.

Background Reading:


For convenience, here are some basic commands for operating the command-line version of MALLET. The first command imports the data and the second generates the topics:


bin\mallet import-dir --input data --output filename.mallet --keep-sequence --remove-stopwords

bin\mallet train-topics --input filename.mallet --num-topics 20 --output-state topic-state.gz --output-topic-keys filename.txt --output-doc-topics filename_composition
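If you would rather script these two steps than retype them, the sketch below assembles the same commands in Python. The options are the ones shown above, written with the double hyphens that MALLET's command line expects. The function names are my own, and the default path assumes you are in the MALLET folder on Windows. The script only prints the commands; `run_all` is there if you want to execute them, and on Windows it may need adjusting because `bin\mallet` is a batch file.

```python
import subprocess


def mallet_commands(input_dir, model_file, num_topics=20, mallet=r"bin\mallet"):
    """Build the import and training commands as argument lists."""
    import_cmd = [
        mallet, "import-dir",
        "--input", input_dir,
        "--output", model_file,
        "--keep-sequence",
        "--remove-stopwords",
    ]
    train_cmd = [
        mallet, "train-topics",
        "--input", model_file,
        "--num-topics", str(num_topics),
        "--output-state", "topic-state.gz",
        "--output-topic-keys", "filename.txt",
        "--output-doc-topics", "filename_composition",
    ]
    return import_cmd, train_cmd


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Print the commands so they can be checked before anything is run.
for cmd in mallet_commands("data", "filename.mallet"):
    print(" ".join(cmd))
```

Changing `num_topics` is the most common adjustment; everything else can stay as it is for a first run.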

Update: A fuller set of instructions for using MALLET can be found at


Still a challenge. I am working on a PHP-based topic browser that improves on the GUI Topic Modeling Tool output, but right now it only lives on my hard drive, so I can’t link to it. Elijah Meeks has made good use of Gephi, but it does not like my graphics card, so I haven’t tried it. It seems to be best suited to types of network analysis.

Right now, the easiest visualisation option seems to be opening CSV data for topic models in Excel and generating graphs there.
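If you prefer not to use Excel, a few lines of Python can give a rough picture of the same data. This is only a sketch: the column names (document, topic, proportion) and the sample rows are invented for illustration, and the files your tool produces will probably be laid out differently, so adjust the names to match.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data; a real file would be opened with open(path, newline="").
SAMPLE = """document,topic,proportion
doc1.txt,0,0.70
doc1.txt,1,0.30
doc2.txt,0,0.20
doc2.txt,1,0.80
"""


def mean_topic_proportions(handle):
    """Average each topic's proportion across all documents in the CSV."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(handle):
        topic = int(row["topic"])
        totals[topic] += float(row["proportion"])
        counts[topic] += 1
    return {topic: totals[topic] / counts[topic] for topic in sorted(totals)}


means = mean_topic_proportions(io.StringIO(SAMPLE))

# Draw a crude text bar chart, one row per topic.
for topic, mean in means.items():
    print(f"topic {topic}: {'#' * round(mean * 40)} {mean:.2f}")
```

The text bars are no substitute for a proper graph, but they are enough to see at a glance which topics dominate a corpus.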

That said, I’m really impressed with Matt Jockers’ theme viewer, presented in anticipation of the publication of his book Macroanalysis: Digital Methods and Literary History (UIUC Press, 2013). It’s really just a combination of individually generated bar and line graphs, combined with word clouds, and stuck in a database, but it’s effective. Also worthy of mention is Elijah Meeks’ use of D3 to create a word cloud “topography”.

Workshop Presentation:

I’m going to re-work it into a blog post during the week after the conference. My blog is



#THATCampSoCal – Camper UPDATE

Greetings, Campers!

We are just over one week away from THATCamp SoCal 2012, which takes place Friday and Saturday, September 14-15, 2012, at Cal State Fullerton!

It has been months since most of you registered. If you are now unable to attend, *please email back to advise of your cancellation*. We still have some latecomers (students) on a wait list who would like to move into your spot, and I would also like to adjust the catering count so that our generous food sponsors are not spending extra money. We hope you get to catch another THATCamp!

If you still plan to attend, please make note of the following items prior to arriving on campus next Friday.

Please make sure you regularly check our blog for last-minute changes and info all next week!

  • Our Schedule has been posted online.
  • Most of our Friday Workshops have been finalized. We still have room to accommodate extra workshops if there is a 1.5-2 hour topic you would like to teach.
  • Saturday’s unconference Sessions will get brainstormed and scheduled Friday afternoon at our 3:30pm Scheduling Session.



  • Campers should plan to bring a laptop or tablet (with chargers); our venue is *not* set in computer labs with desktop computers.
  • Most of Friday’s Workshops require you to bring a laptop for the actual hands-on work. If you do not have access to a laptop, you will still be able to sit in on workshops to observe and take notes. A tablet won’t be able to accommodate the software used in most of the Workshops.


  • A continental breakfast and coffee will be provided each morning — sponsored by CSUF’s Pollak Library, and CSUF’s College of Humanities & Social Science!
  • Subway Sandwiches will be provided for lunch on Friday — sponsored by the Occidental College Center for Digital Learning & Research.
  • Roundtable Pizza will be provided for lunch on Saturday — sponsored by the UCLA Center for Digital Humanities.
  • Extra mid-day coffee provided courtesy of a THATCamp grant from Microsoft (yes, Microsoft).
  • Want more options? Check our campus dining (on your own) at:
  • Join us Friday night for Happy Hour (on your own) at the Cantina Lounge, across from campus.

Please feel free to contact me if you have additional questions or concerns.

We look forward to learning and networking with you next week at THATCamp SoCal!

