A New St. Louis? Powering Smart Cities with HPC, Big Data and Cloud

Loop Trolley Model

A new initiative aims to install a high-speed fiber network along the proposed Loop Trolley route.

Entrepreneurs, Technologists, Business and Community Leaders, St. Louisans:

Could a new St. Louis be emerging? Starting with the Loop Media Hub initiative, St. Louis has the potential to transform into a “Smart City,” and High Performance Computing (HPC), Big Data and Cloud computing will be essential enabling technologies. Join us on March 15 at the Missouri History Museum for a free STLhpc.net event to get involved and to hear more about the initiative, the opportunities it is creating and how HPC, Big Data and Cloud computing apply. You’ll receive details about the recently released Request for Information (RFI), which will give you the chance to contribute to and influence this transformation.


  • Frank Lee, Ph.D. – IBM Senior IT Architect
    • Substituting for Mark Dixon – IBM Client Technical Architect – Infrastructure
  • John Leach – CEO at Incite Retail
  • Gary Stiehr – Founder, STLhpc.net
  • Brad Molander – Technology Evangelist at NISC


  1. Networking (5:00 PM to 5:30 PM)
  2. An Overview of BigData Analytics & Cloud in Smarter Cities (Frank Lee, substituting for Mark Dixon)
  3. Leveraging Distributed Computing to Enhance Customer and Visitor Relationship Management (John Leach)
  4. Smart Cities: Where Transportation Research Meets High Performance Computing (Gary Stiehr)
  5. Applying Distributed Computing to SmartGrid (Brad Molander)
    • Electrical grids across the nation have recently been deploying modernized assets, such as smart meters, that produce enormous amounts of data. Over the last two years, NISC has designed, engineered and deployed an internal private cloud environment to support the modern data needs of SmartGrid initiatives. This presentation will cover some fundamentals of smart grid and illustrate the challenges and benefits of private cloud environments.
  6. How You Can Get Involved – The Loop Media Hub RFI

Further Information:
A new tourist trolley, known as the Loop Trolley, is planned to run between Forest Park and the Delmar Loop here in St. Louis. An emerging plan, the Loop Media Hub initiative, aims to take advantage of the construction to install fiber optic cabling along that route to enable a high-speed network for the region. But that’s just the foundation for the overall effort, which could be the beginning of a new St. Louis.

St. Louis is Transforming…

Starting with the Loop Media Hub initiative, St. Louis has the potential to transform into a “Smart City,” and High Performance Computing (HPC), Big Data and Cloud computing will be essential enabling technologies. A Smart City is a community that seeks to accelerate the economic development of its metropolitan Internet and the productivity and quality of life of its businesses and residents through the thoughtful alignment of related organizations, resources and infrastructure.

St. Louis is Connecting…

The City of University City, Washington University, The Center of Creative Arts, The Regional Arts Council, The St. Louis Development Corporation and The University City Chamber of Commerce are considering a plan for the development of a “one Gigabit Internet ecosystem for the community along The Loop Trolley right of way”. The ecosystem plan will include provisions for ultra-gigabit connectivity, wireless services, smart phone apps and directory services, entrepreneurial incentives, economic development incentives, creative innovations and energy policy. The Loop Media Hub initiative makes use of a collaborative community engagement process through which the actual plan will be developed.

St. Louis is Taking Action…

Join us for an informative session about the Loop Media Hub initiative, the possibilities for St. Louis as a Smart+Connected City, and how we can apply HPC, Big Data and Cloud computing to the initiative to empower businesses, agencies and residents to transform St. Louis. We’ll intermix brief discussion sessions where entrepreneurs, business and community leaders, technologists and other attendees can share ideas and perhaps form collaborations to generate new responses to the recently released Loop Media Hub RFI. This is a great opportunity for new and existing start-up ventures, universities and other local organizations to get involved in a collaborative project.


The Rise of Big Data – March 2, 2012 in St. Louis

On March 2, 2012, the St. Louis chapter of The Data Warehousing Institute™ (TDWI) will host “The Rise of Big Data; Case Studies: Putting Big Data to Work – A Hadoop and Hbase Case Study and Process Mining of Clinical Workflows Case Study.”  This event is free of charge and open to all interested BI/DW professionals. Breakfast will be provided.

St. Louis-based Incite Retail LLC and Mercy St. Louis will be sharing their Big Data expertise through a morning of presentations.  To attend, please RSVP at TDWI’s registration page.


8:00 – 8:30 a.m. Registration, Continental Breakfast and Networking
8:30 – 9:00 a.m. Opening Remarks 

M. C. Sankar, Chapter President

9:00 – 10:00 a.m. Session 1: 

“The Rise of Big Data”

By John Leach

Founder of Incite Retail LLC


Big Data is being discussed in many organizations, and it is getting a lot of attention. We are at the beginning of a “Big Data” technology wave that seems to mirror the onset of e-commerce in its power to disrupt and change competitive landscapes. While we have seen the wave traverse from search engines to social media as it picks up strength, what do the internals of that wave look like, and how does an organization practically harness that power? This presentation will cover the following:

* What is Big Data?

* The progression of the Big Data wave

* Current vendors and approaches

* How does Distributed Computing work

10:00 – 10:30 a.m. Networking Break
10:30 – 11:30 a.m. Session 2:

“Case Study 1: Process Mining of Clinical Workflows Case Study”

By Vidyalakshmi Iyer and Anil Kabra

Mercy Health System

“Case Study 2: Putting Big Data to Work: A Hadoop and HBase Case Study”

By John Leach

Founder of Incite Retail LLC


Case Study 1: Mercy is on a mission to improve clinical workflows and has found a way to leverage system access logs and “big data” technology to analyze how caregivers do their work. Experts from Mercy’s architecture team will provide an overview of the process and technology used to extract and manipulate the access log information, currently at more than 20 billion rows and 4 TB, as well as the analytical techniques used to gain insight into operational workflow patterns.

Case Study 2: Traditional databases and architectures are not sufficient to harness the power of Big Data. In this new digital age, the sheer volume, velocity, and variety of data require a new class of technologies and approaches. Learn through a case study how the open-source Apache™ Hadoop™ framework and HBase can be used to manage “Big Data.”

11:30 a.m. – 12:00 p.m. Raffles and Closing Remarks


See The Rise of Big Data for full details.  To attend, please RSVP at TDWI’s registration page.



St. Louis HPC Expertise Enables Discovery of Mutations Tied to Aggressive Childhood Brain Tumors

As the latest result from the St. Jude Children’s Research Hospital – Washington University Pediatric Cancer Genome Project, researchers studying a rare, lethal childhood tumor of the brainstem discovered that nearly 80 percent of the tumors have mutations in genes not previously tied to cancer. Early evidence suggests the alterations play a unique role in other aggressive pediatric brain tumors as well.

“We are hopeful that identifying these mutations will lead us to new selective therapeutic targets, which are particularly important since this tumor cannot be treated surgically and still lacks effective therapies.”
– Suzanne Baker, Ph.D., co-leader of the St. Jude Neurobiology and Brain Tumor Program and a member of the St. Jude Department of Developmental Neurobiology

To identify these mutations, the data produced by DNA sequencers must first be processed with a number of bioinformatics tools. To complete in a timely fashion, this analysis must run in parallel on multiple computers. For the Pediatric Cancer Genome Project, the High Performance Computing (HPC) expertise and resources at The Genome Institute at Washington University in St. Louis were leveraged to enable this analysis. Not only is each case analyzed in parallel, but multiple separate cases are also analyzed simultaneously.

Analyzing what amounts to several hundred billion data points at any given time requires managing a cluster of computational systems as well as high performance storage and network subsystems.  Beyond that, a robust software framework helps to manage data sets and ensure fault tolerance.  This is another example of where using High Performance Computing is key to the advancement of organizations grappling with the analysis of large numbers of data points and complex analyses.
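The two levels of parallelism described above (parallel work within one case, and several cases at once) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the toy "GC count" step and the sample reads are all hypothetical stand-ins, and a real pipeline would dispatch jobs to a cluster scheduler rather than to threads on one machine.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_chunk(chunk):
    """Hypothetical stand-in for one analysis step: count G/C bases in a chunk of reads."""
    return sum(read.count("G") + read.count("C") for read in chunk)

def split(reads, n_chunks):
    """Split a list of reads into up to n_chunks roughly equal pieces."""
    size = max(1, -(-len(reads) // n_chunks))  # ceiling division
    return [reads[i:i + size] for i in range(0, len(reads), size)]

def analyze_case(reads, workers=4):
    """Inner level: process the chunks of a single case in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(analyze_chunk, split(reads, workers)))

def analyze_cases(cases, workers=2):
    """Outer level: process several independent cases at the same time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(analyze_case, cases.values())
        return dict(zip(cases.keys(), results))

# Made-up sample data, purely for illustration.
cases = {
    "case_a": ["GATTACA", "CCGG", "ATAT"],
    "case_b": ["GGGG", "TTTT"],
}
gc_counts = analyze_cases(cases)
print(gc_counts)
```

Each case gets its own inner pool, so a slow case does not block the chunks of another; on a real cluster the same shape would typically appear as per-case job arrays submitted side by side.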

For more details on the findings, see the press release.  Follow STLhpc.net on LinkedIn, Twitter and/or Facebook to keep up to date on ways in which your organization can leverage HPC to advance your work.

About the Project:
St. Jude Children’s Research Hospital and Washington University School of Medicine in St. Louis announced on January 25, 2010, an unprecedented effort to identify the genetic changes that give rise to some of the world’s deadliest childhood cancers. The two institutions have joined forces to decode the genomes of more than 600 childhood cancer patients who have contributed tumor samples for this historic effort.


St. Louis Cited as Regional Hub Advancing HPC, Cloud & Data-Intensive Computing

HPC in the Cloud, an industry leading online publication dedicated to covering high-end cloud computing in science, industry and the datacenter, has described St. Louis as “a regional hub for the public and private sector advancement of high performance and cloud computing capabilities for genomics and other data-intensive industries” as it recognized the growth and momentum built up in 2011 by St. Louis-based Appistry.  A sell-out crowd at the STLhpc.net High Performance Data Analytics event in St. Louis heard from Appistry about the high performance computing (HPC) technologies that they employ to help tackle data-intensive problems in big-data industries, such as intelligence, defense, life sciences, financial services, and transportation.

As mentioned in the article, St. Louis is also home to a number of organizations advancing genetics research, such as The Genome Institute at Washington University, the University of Missouri–St. Louis, Monsanto and the Danforth Center. Much of this advancement is thanks to computationally intense, data-driven research projects. With this research happening in more and more organizations around St. Louis, it is a great environment in which the St. Louis HPC community can continue to grow and connect. These connections within the community are driving businesses and technologists in St. Louis to learn from and teach each other about HPC technologies to help advance the region as a whole. This advancement comes not only in the area of high performance computing but also in the research areas that it enables. To see examples of this, just look around STLhpc.net or take a look at these selected advancements:


J.P. Morgan Deploys Supercomputer for Fixed Income Trading

HPCwire reports that J.P. Morgan has deployed a supercomputer from Maxeler Technologies for fixed income trading operations: “Maxeler’s approach to supercomputing will enable J.P. Morgan to assess tens of thousands of possible market scenarios, constantly examining the time path and structure of the associated risk. This means that complex scenarios can now be calculated in a few minutes rather than hours.”

Could St. Louis companies like Edward Jones, Scottrade, Stifel Nicolaus and so on utilize such technology to more quickly assess their investments or to pass on insights to their customers/clients?

J.P. Morgan’s next system from Maxeler will be equivalent to approximately 12,000 conventional processor cores but using only about 4% of the space and power consumption.  To enable this, Maxeler utilizes FPGA (Field-Programmable Gate Array) technology in their systems.

It happens that St. Louis has tremendous expertise in FPGA technologies.  For example, St. Louis-based Exegy provides an FPGA-based ticker plant for processing market data.  St. Louis-based Global Velocity produces an FPGA-based network security appliance.  St. Louis-based Velocidata is working to produce FPGA-based text mining and data integration appliances.  Further, Washington University in St. Louis provides leadership in FPGA research through their Department of Computer Science and Engineering.

St. Louis’ strength in the financial services sector could be bolstered by the use of High Performance Computing (HPC) technologies.  Further, its expertise in FPGA technologies and High Performance Computing in general allows for the development of innovative new products within the financial services market.


St. Louis’ Cardinal Glennon Participates in Personalized Pediatric Cancer Research using High Performance Computing

SSM Cardinal Glennon Children’s Medical Center in St. Louis and Children’s Mercy Hospital and Clinics in Kansas City, MO are two of the hospitals that will participate in a clinical research trial that will leverage cloud-based HPC resources donated by Dell to help researchers study pediatric cancer.  According to InformationWeek’s article Dell Donates Cloud Power For Pediatric Cancer Research, the cloud infrastructure will be placed at the Translational Genomics Research Institute (TGen):


Depiction of DNA showing a Single Nucleotide Polymorphism (SNP), one of the types of variants accounted for in personalized medicine. Image Source: Wikipedia

The donation of Dell’s secure, high-performance cloud-based computing resources will increase by 1,200 percent the gene sequencing and analysis capacity of TGen’s existing clinical cluster.

Currently, genomic sequencing of individual patients can take months, generating more than 4 terabytes of data, said Coffin in an interview with InformationWeek Healthcare. With the donated cloud technologies, the time needed for the genomic mapping and analysis of tumors can be shortened to weeks, he said.

The goal of the project is to provide oncologists and cancer researchers with the computational resources to do complex analysis on patient genomics and treatments faster so that personalized therapies can be better targeted to individual patients in the hopes of improved outcomes, and ultimately saving kids’ lives.

InformationWeek’s article Dell Donates Cloud Power For Pediatric Cancer Research has more details.

Here are some other examples of High Performance Computing playing a role in cancer research and personalized medicine:


High Performance Computing and Sports Analytics

With the 2011 World Series about to kick off in St. Louis on Wednesday, it is worth noting that analytics has been common in baseball for years. As in other fields, however, the amount of data has begun to grow rapidly. According to the article Beyond ‘Moneyball’: The rapidly evolving world of sports analytics, Part I, the data available to decision-makers has grown exponentially over the last 15 years:

Innovations in sports science, ranging from training routines to nutritional regimens, coupled with improved reporting from medical staffs and trainers have all come with their own data sets that are gathered and tracked somewhere within an organization. With improved communications via the Internet, the frequency and amount of information captured, stored and distributed by scouts and coaches at all levels has grown significantly.

But it doesn’t stop there:

The advent of motion capture technology has expanded the data collected from each game. This technology tracks everything that moves on a field every 100th of a second. The impact of this is staggering for it transforms the amount of information captured for a single game from a few hundred rows of data to well over one million. Major League Baseball, the NBA and pro soccer teams have implemented this type of technology.

This all sounds very similar to what we heard on October 5, 2011 at the STLhpc.net High Performance Data Analytics event when we saw how High Performance Computing (HPC) is helping other organizations to tackle the analysis of ever growing quantities of increasingly complex data.  HPC could similarly help professional sports organizations leverage their wealth of data to make more informed decisions.
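To get a feel for the data volumes the article describes, here is a back-of-the-envelope estimate. Only the sampling rate (one sample every 100th of a second) comes from the quoted article; the number of tracked objects and the game length are assumptions chosen purely for illustration.

```python
SAMPLES_PER_SECOND = 100  # from the article: one sample every 100th of a second

def rows_per_game(tracked_objects, game_seconds, samples_per_second=SAMPLES_PER_SECOND):
    """Estimate rows captured in one game, assuming one row per object per sample."""
    return tracked_objects * game_seconds * samples_per_second

# Assumed values: 20 tracked objects (players, ball, umpires) over a 3-hour game.
rows = rows_per_game(tracked_objects=20, game_seconds=3 * 60 * 60)
print(f"{rows:,} rows per game")
```

Even with these modest assumptions the estimate lands in the tens of millions of rows, consistent with the article's "well over one million" per game, and a full season multiplies that further.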

I recommend that you read the entire article Beyond ‘Moneyball’: The rapidly evolving world of sports analytics, Part I to hear more about how analytics have been and will be used in sports.



St. Louis Hadoop Users Group to Give Free Hands-on Hadoop Intro on October 18, 2011


St. Louis Hadoop Users Group

The STLhpc.net High Performance Data Analytics event held on October 5, 2011 was a success. Cloudera, St. Louis-based Appistry and EMC/Greenplum presented on how to enable and leverage a high performance data analytics environment. With over 50 people in attendance from over 30 organizations, there was a lot of interest in taking advantage of St. Louis’ strengths in High Performance Computing (HPC) to power the growing trend of analyzing larger, more complex data sets. The next opportunity to put that interest into action is the St. Louis Hadoop Users Group’s October 18, 2011 meetup, which will use HPC resources provided by R Systems, Inc. and a Hadoop instance configured by Incite Retail, a St. Louis-based company.

At the St. Louis Hadoop Users Group’s October 18, 2011 meetup, the St. Louis HPC community will be able to get some free hands-on Hadoop training, including:

  • HDFS File System Operations (Put, Get, etc.)
  • Simple Map / Reduce Program
  • HBase (Create table, Scan, etc.)
  • Hive (Create table, Simple Aggregate Queries)
  • Hadoop Q and A
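For readers new to the "Simple Map / Reduce Program" topic listed above, the idea can be sketched in a few lines of plain Python. This is not the Hadoop API and not the material used at the meetup; it is a minimal single-machine word count, with hypothetical function names, that mimics the map, shuffle and reduce stages a framework such as Hadoop would distribute across a cluster.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

counts = word_count(["big data big compute", "Big clusters"])
print(counts)
```

Because each mapper call sees only one line and each reducer call sees only one key, both stages can run on many machines at once, which is what makes the pattern a good fit for large data sets.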

During the training, we will use a seven-node cluster with approximately 32 TB of storage, customized on demand for this event by R Systems, Inc., which has pledged to support the St. Louis HPC community by providing its members complimentary access (subject to availability) to custom cluster configurations. Hadoop was installed and configured on that cluster by Incite Retail, a St. Louis-based company providing HBase and Hadoop consulting as well as general expertise to help clients understand distributed databases.

Please take advantage of this free event that will help you to learn how to leverage High Performance Computing to perform high performance data analytics on ever growing quantities of increasingly complex data.


Analytics Experts Meet in St. Louis to Discuss High Performance Data Analytics

High Performance Data Analytics

Join us at this free STLhpc.net event on October 5th in St. Louis where analytics experts from Cloudera, St. Louis-based Appistry and EMC/Greenplum will join together to present for business executives, software developers and system administrators how to enable and leverage a high performance data analytics environment. Organizations in a growing number of industries including retail, health care, bioinformatics, financial services, scientific research, transportation and others are finding competitive advantages by increasing their ability to transform unprecedented amounts of available data into business insights and discoveries.

If you are just starting to investigate high performance data analytics, however, there can be a lot to know and a lot of options. Hadoop, HBase, Pig, Hive, HDFS, Appistry’s Ayrris, EMC’s Greenplum, Cloudera’s CDH, Map Reduce, NoSQL, and so on are just some of the methods and technology options available to help you leverage cost-effective compute and storage for your high performance data analytics. The presenters will provide clarification about where these components fit, how they relate to one another, how different industries are leveraging these tools and how you can get started using them as well.


  • Subramanian Kartik – Global Field CTO for the Data Computing Division at EMC
  • Michael Groner – Chief Architect at Appistry
  • Sultan Meghji – Vice President of Analytics Applications at Appistry
  • Michael Katzenellenbogen – Solutions Architect at Cloudera
  • Gary Stiehr – Founder, STLhpc.net; Information Systems Group Leader at The Genome Institute at Washington University in Saint Louis


St. Louis Team Using HPC in Efforts to Develop Personalized Breast Cancer Vaccines


Depiction of DNA containing a single-nucleotide polymorphism (SNP), one of the types of variants accounted for in personalized medicine. Image Source: Wikipedia

A team of clinicians and researchers at Washington University in St. Louis hopes to develop sequencing-based personalized breast cancer vaccines. According to a summary from GenomeWeb’s Clinical Sequencing News, “The team plans to use sequencing to create a polyepitope DNA vaccine that will contain ‘fragments from each individual mutated protein’ for individual patients. They hope to begin human trials for the vaccine within the next two years.”

As described in HPC Enables Genome Analysis Used to Alter Course of Patient’s Cancer Treatment, the analysis of DNA sequencing data for cancer research requires quite a bit of computational capacity. As attempts are made to develop personalized treatments, the techniques used for producing the initial data and the algorithms and workflow for analyzing that data will continue to evolve as the number of patients increases. To enable that process, High Performance Computing (HPC) resources at Washington University in St. Louis are being utilized to provide the required computational, storage and networking horsepower. It is likely that hundreds of thousands of CPU hours and hundreds of terabytes of stored and intermediate data will be involved in these analyses, although that is a moving target since the algorithms and data representation models continue to evolve, as do considerations of the sequencing approach to be used (e.g., whole genome sequencing vs. targeted and exome sequencing).
