Mike Lamoureux's Blog

Work: I am a health information technology architect, management consultant, and Adjunct Professor located in the Washington, DC Area.

Play: Red Sox and Patriots fan, dog owner, technologist, coffee addict

8 posts tagged semantic web

Defining the Problem - Octo Consulting’s entree into Semantic Web

Note: This blog post was co-posted at Octoconsulting.com, with the help of our Director of Health IT, Dr. Charlie Mead.

Octo set out a few weeks ago to find a path to solving a problem that is simple to explain but difficult to solve. We call it an “End-to-End Use Case.” The challenge is to find a way to discover “semantically equivalent” data that has been collected in multiple studies. Traditionally, this is a relatively straightforward task if done pre-study (although it requires considerable “top-down” governance), but very difficult once a study has been “designed” and executed. The inherent semantic difficulty of the task is compounded by “non-semantic” barriers, including different physical data persistence and access models and the wire-format “serialization brittleness” of XML, the lingua franca for much clinical trial data exchange. Our work is based on the overarching thesis that an end-to-end Semantic Web-based representation of study meta-data and data, plus data transport formats, largely circumvents the non-semantic barriers, thereby allowing study stakeholders to focus on the core problem: interoperable semantics.

Octo Consulting and members of the W3C’s Healthcare Life Sciences Working Group have developed a concrete instance of our hypothesis in which study meta-data and data are represented using RDFS and OWL ontologies of the HL7 Model Interchange Format (MIF), which includes the HL7 Reference Information Model (RIM), data types, and vocabulary bindings, and of SNOMED-CT. Data transport will use a Semantic Web representation of the CDISC ODM (Operational Data Model) specification. SPARQL endpoints are used for data discovery and analysis.

Octo Semantic Web Hackathon in partnership with the W3C Life Sciences

Note: This blog article was co-posted at Octoconsulting.com

Octo is thrilled to announce a three-day Semantic Web “hackathon” (February 19th - 22nd), featuring guest Eric Prud’hommeaux from the World Wide Web Consortium (W3C). We called upon Eric because of his role in the Semantic Web Health Care and Life Sciences Interest Group (HCLS IG). While Semantic Web is not a domain-specific technology, we are deeply interested in how Semantic Web technologies can assist in the areas of healthcare and health research. We are striving to accomplish a number of objectives this week, including:

  • Familiarize our consultants and technical staff with our use case: developing an “End to End” solution for health information trails (see Dr. Mead’s upcoming blog post on “Defining the Problem”)
  • Develop robust Semantic Web talent across Octo’s solutions team of Architects, Developers, and Management Consultants
  • Familiarize Octo with many of the open source tools in the market and build our own environment for Semantic Web tools Research & Development in our “Octo Labs” cloud environment
  • Develop additional materials for technical training sessions on Semantic Web technologies

Look for blog posts this week from our Octo team on the ground, featuring our perspective on interesting topics such as how Semantic Web and Legacy IT are integrated, an update on the current progress of healthcare interoperability standards in Semantic Web, a recap of our daily progress at the “hackathon”, business intelligence and visualization, and much more. To get in touch, tweet us at @mlamoure or @octoconsulting with questions, comments, and thoughts. Wish us luck!

Finished the graph for my Social Connections this weekend; so far Facebook is the only data point.  See my link to the dashboard here.  It took much longer than I anticipated, as I had to change SPARQL endpoints a few times to get what I wanted.  A few of the SPARQL endpoints out there have limited or no support for BIND, and/or no support for SPARQL Update operations such as INSERT.  I’m now using Jena’s SPARQL endpoint.  Eventually I’ll expose the endpoint to the web if anyone is interested in the raw data.  Right now it’s firewalled.
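A quick way to find out whether an endpoint handles BIND and updates is to send it two tiny probe requests. The sketch below is a minimal JavaScript illustration, not the code behind the dashboard: the probe IRIs under example.org and the function name sparqlRequest are made up for this example. It relies only on the SPARQL 1.1 Protocol conventions that queries can be sent as a GET with a "query" parameter and updates as a form-encoded POST with an "update" parameter.

```javascript
// Minimal probes. The example.org IRIs are placeholders, not a real vocabulary.
const BIND_PROBE = 'SELECT ?x WHERE { BIND(1 AS ?x) }';
const UPDATE_PROBE =
  'INSERT DATA { <http://example.org/probe> <http://example.org/checked> true }';

// Describe the HTTP request the SPARQL 1.1 Protocol expects for each operation.
// Nothing is sent here; the returned object can be handed to any HTTP client.
function sparqlRequest(endpoint, kind, text) {
  if (kind === 'query') {
    // Queries: GET with the query text in the "query" URL parameter.
    const url = new URL(endpoint);
    url.searchParams.set('query', text);
    return {
      method: 'GET',
      url: url.toString(),
      headers: { Accept: 'application/sparql-results+json' },
    };
  }
  if (kind === 'update') {
    // Updates: POST with the update text form-encoded in the "update" field.
    return {
      method: 'POST',
      url: endpoint,
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({ update: text }).toString(),
    };
  }
  throw new Error(`unknown operation kind: ${kind}`);
}
```

An endpoint that rejects the first request most likely lacks SPARQL 1.1 BIND support, and one that rejects the second is probably read-only. Servers such as Fuseki usually expose query and update as separate service URLs, so the endpoint argument would differ between the two calls.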

Now that I have the social interactions proof of concept complete, I will move to getting more interesting data out of Facebook and my other social networks.  And then, of course, since I’m graphing them over time, it will take a bit to get the historical data to make it interesting.

EasyRDF has also been a great help, and I’m hoping the author continues to develop the library.

Progress on Semantic Web Dashboard

I am in the process of adding another source of data to my Semantic Web Dashboard.  I want to chart my number of friends via Social Media over time.  I started with Facebook, but Facebook doesn’t give me access to my historical friend count, only the current count today.  So, I’m writing an app that will record the number of friends I have on a daily basis.  I plan to expand it down the road to include number of postings and other things.  This will mean that any chart I create will have limited data initially.  I didn’t get to actually creating the chart yet; it took me long enough to figure out Facebook’s Graph API and to get that working properly (did I mention that I hate JSON?).

Here is the source to my PHP script that grabs my friend count:

<?php

//require_once("facebook-php-sdk-master/src/facebook.php");

$user_id    = "7412441";
$app_id     = "REMOVED";
$app_secret = "REMOVED";
$my_url     = "REMOVED";

// Request an app access token using the client_credentials grant.
$app_token_url = "https://graph.facebook.com/oauth/access_token?"
    . "client_id=" . $app_id
    . "&client_secret=" . $app_secret
    . "&grant_type=client_credentials";

$response = file_get_contents($app_token_url);

// parse_str() reads the response as a query string (access_token=...).
$params = null;
parse_str($response, $params);

$graph_url = "https://graph.facebook.com/app?access_token="
    . $params['access_token'];

//$app_details = json_decode(file_get_contents($graph_url), true);
//$query_url = "https://graph.facebook.com//fql?q=SELECT+friend_count+FROM+user+WHERE+uid=" . $user_id . "&access_token=" . $params['access_token'];

// Ask the Graph API for the user's friends list.
$query_url = "https://graph.facebook.com/$user_id?fields=friends&access_token=" . $params['access_token'];

$rawdata = file_get_contents($query_url);
//echo $rawdata;

// Decode the JSON response and count the entries under friends.data.
$friends = json_decode($rawdata, true);
$friends_count = count($friends['friends']['data']);

echo $friends_count;

// Write to RDF, not complete yet.
$RDFData = "data.ttl";
$fh = fopen($RDFData, 'a');
$stringData = "";

?>
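The unfinished "Write to RDF" step could look something like the following. This is only a sketch in JavaScript rather than PHP, and the ex: namespace, the class ex:FriendCount, the properties ex:date and ex:count, and the function name friendCountToTurtle are all placeholders invented for illustration, not a vocabulary the dashboard actually uses. It formats one day's count as a Turtle blank-node observation that can be appended to data.ttl.

```javascript
// Prefix declarations, needed once at the top of a new Turtle file.
// The ex: namespace is a made-up placeholder.
const TURTLE_PREFIXES =
  '@prefix ex: <http://example.org/dashboard#> .\n' +
  '@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n';

// Format one daily friend count as Turtle.
// Pass includePrefixes = true only when starting a new file.
function friendCountToTurtle(isoDate, count, includePrefixes = false) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(isoDate)) {
    throw new Error('isoDate must look like YYYY-MM-DD');
  }
  if (!Number.isInteger(count) || count < 0) {
    throw new Error('count must be a non-negative integer');
  }
  // A blank node ([]) stands for the observation; a bare integer is an xsd:integer.
  const triples =
    '[] a ex:FriendCount ;\n' +
    `   ex:date "${isoDate}"^^xsd:date ;\n` +
    `   ex:count ${count} .\n`;
  return (includePrefixes ? TURTLE_PREFIXES : '') + triples;
}
```

Appending the returned string to the file once a day would build up the history that Facebook itself does not expose.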

I’m very excited that I have my first chart / widget being produced off Semantic Web technologies. The link (click the header bar with the link icon) takes you to a server I have running.

Here’s how it works: the index.html file (code is below) makes an AJAX call to a PHP file called “getEnergyData.php” and uses that data to chart the information in a Google Chart.  The PHP script’s ultimate output is a data format called JSON.  The PHP script uses a library I found called easyRDF, which has been very helpful and saved me lots of time.  easyRDF makes a call to my SPARQL server, which is hosting some simple data about my energy usage in 2011 (I simply haven’t entered my 2012 data yet).

The only change from my original plan is that I’m running SWObjects rather than Apache Jena Fuseki as my SPARQL Server.  SWObjects is cleaner and simpler, and I didn’t need anything fancy for what I was doing.

What I learned: Moving between RDF and JSON isn’t as simple as I thought.  I’m embarrassed by the poor coding quality of my getEnergyData.php script (which is why I’m not sharing it yet).  JSON has the benefit of having many developer libraries that work well with it.  The ultimate goal is to keep manipulating the data as RDF (so as to keep its context) until the moment you wish to output JSON.  Currently I’m using PHP to iterate through records and write out the JSON by hand, which I’m sure is not the best tactic.
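One way to make the RDF-to-JSON step less ad hoc is to let the SPARQL server return its results in the standard SPARQL 1.1 Query Results JSON Format and reshape that into the literal object that google.visualization.DataTable accepts. The sketch below is an assumption about how that could be done in JavaScript, not the withheld getEnergyData.php; the function name sparqlJsonToDataTable is invented here. A column is treated as numeric only when every binding for it carries one of a few common XSD numeric datatypes.

```javascript
// XSD datatypes that should become "number" columns in the DataTable.
const XSD = 'http://www.w3.org/2001/XMLSchema#';
const NUMERIC_TYPES = new Set(
  ['integer', 'int', 'long', 'decimal', 'double', 'float'].map((t) => XSD + t)
);

// Convert SPARQL JSON results ({head: {vars}, results: {bindings}})
// into a DataTable literal ({cols: [...], rows: [{c: [{v}, ...]}, ...]}).
function sparqlJsonToDataTable(sparqlJson) {
  const vars = sparqlJson.head.vars;
  const bindings = sparqlJson.results.bindings;

  // A variable is numeric if it is bound in every row with a numeric datatype.
  const isNumeric = (name) =>
    bindings.length > 0 &&
    bindings.every((b) => b[name] && NUMERIC_TYPES.has(b[name].datatype));

  const cols = vars.map((name) => ({
    id: name,
    label: name,
    type: isNumeric(name) ? 'number' : 'string',
  }));

  const rows = bindings.map((b) => ({
    c: cols.map((col) => {
      const term = b[col.id];
      if (!term) return { v: null }; // unbound variable in this row
      return { v: col.type === 'number' ? Number(term.value) : term.value };
    }),
  }));

  return { cols, rows };
}
```

With a result set that has, say, a string month variable and an integer kWh variable, the output can be passed straight to the DataTable constructor, so no JSON needs to be assembled by string concatenation.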

Here is the index.html code:

<html>
  <head>
    <script type="text/javascript" src="https://www.google.com/jsapi"></script>
    <script type="text/javascript" src="http://code.jquery.com/jquery-1.8.3.js"></script>
    <script type="text/javascript">

    // Load the Visualization API and the corechart package (which provides LineChart).
    google.load('visualization', '1', {'packages':['corechart']});

    // Set a callback to run when the Google Visualization API is loaded.
    google.setOnLoadCallback(drawChart);

    function drawChart() {
      // Fetch the chart data synchronously; responseText is the raw JSON string.
      var energyDataArray = $.ajax({
          url: "getEnergyData.php",
          dataType: "json",
          async: false
          }).responseText;

      // Create our data table out of JSON data loaded from server.
      var data = new google.visualization.DataTable(energyDataArray);

      var options = {
          title: 'Energy Usage (kWh)'
        };

      //var data = google.visualization.arrayToDataTable(energyDataArray);

      // Instantiate and draw our chart, passing in some options.
      var chart = new google.visualization.LineChart(document.getElementById('chart_div'));
      chart.draw(data, options);
    }

    </script>
  </head>
  <body>
    <div id="chart_div" style="width: 700px; height: 500px;"></div>
  </body>
</html>

Building a Semantic Web Personal Dashboard

I’m taking up a pet project to develop a personal data dashboard that I will make partially public on my blog.  I’m challenging myself to do this because I was looking for an achievable project to undertake using Semantic Web technologies.  Here are the data sources, some of which are manual, that I’m considering using:

  • Energy Usage Data (SOURCE: Power Bills; I wish NEST would give me specifics on this in an automated way)
  • Personal Health Information including my weight (SOURCE: Fitbit scale)
  • Social Trends Information (SOURCE: Facebook and Twitter)
  • Personal finances (likely will hide the Y-axis!) such as net worth or retirement savings (SOURCE: iBank)
  • Average TV Usage (SOURCE: My Home Automation System)

Here are the details for the plan:

  • I will use RDF to store my data in a flat file
  • I will use Fuseki from the Apache Jena project to serve that data using SPARQL
  • I will use PHP with the easyRDF library to query the information and convert the results to JSON
  • I will use Google Charts to produce the dashboard, and jQuery to load the information asynchronously
  • I’ll host this on my Synology Server (hopefully without having to keep a VM running on my iMac to successfully keep it hosted, but we will see)

I look forward to showing my colleagues at Octo Consulting my progress.  I know some of the developers there may have some suggestions on how to best work with JSON, something I’m not very experienced with.  Wish me luck, I’ll keep you posted on progress.

Semantic Web Activities

I’ve been going through “Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL” (amazon link) for quite some time.  Today, I am pairing that with a tutorial from Matthew Horridge.  It is quite amazing how primitive the tools for the semantic web remain; I believe this raises the barrier to entry for most established technical folks, who look at the semantic web as if it were a foreign language.

This is the future of healthcare data, and what will pave the way for future health IT systems initiatives, I’m almost convinced.  I’ll post more on my progress.

Economics of the Semantic Web

The past few weeks we have heard about Twitter beginning to limit their programming interface (example 1, example 2, …) for third party applications.  The reason being that they want more visitors to their own website to capitalize on ad revenue.

Tumblr, a blog/social network that I reluctantly started using recently as well, is now considering its ad model.

The unfortunate reality of selling online advertising is that it has to be done in a rich user experience environment to be effective.  Therefore, when a company sells its valuable web real estate, the goal is to drive as much traffic to its own website as possible to capitalize on that ad revenue.  Websites then tend to engage in practices such as truncating their RSS feeds with “Read More…” links, restricting external APIs, and more.

My question is: is there a model that can promote information sharing and linking on an internet that is advertisement supported?  If each individual website is only concerned with bringing traffic to its own fiefdom, then how will we enable websites that can mash up content from different sources?  This has been the goal of the semantic web, an approach supported by the W3C since 2001 that has met with limited success.

You can’t blame Twitter for protecting their data, it has economic value.  Would an open source alternative to Twitter be the answer?  Social Media and Social Networking are not the only areas lacking semantic linking.  Would this model work for other domains?

This argument could likely be tied to net neutrality, which would allow for alternative revenues for websites other than through advertising.  APIs can have revenue streams (and do!), but it seems that they are much less lucrative than selling ads yourself.
