StarAI 2015

Fifth International Workshop on Statistical Relational AI

The purpose of the Statistical Relational AI (StarAI) workshop is to bring together researchers and practitioners from two fields: logical (or relational) AI and probabilistic (or statistical) AI. These fields share many key features and often solve similar problems and tasks. Until recently, however, research in them has progressed independently, with little or no interaction. The fields often use different terminology for the same concepts and, as a result, keeping up with and understanding results in the other field is cumbersome, which slows down research. Our long-term goal is to change this by achieving a synergy between logical and statistical AI. As a stepping stone towards realizing this big-picture view of AI, we are organizing the Fifth International Workshop on Statistical Relational AI at the 31st Conference on Uncertainty in Artificial Intelligence (UAI) in Amsterdam, The Netherlands, on July 16th, 2015.



StarAI will be a one-day workshop with around 50 attendees, paper presentations and poster spotlights, a poster session, and three invited speakers:

  • Christopher Ré (Stanford)
  • Kristian Kersting (TU Dortmund)
  • Sebastian Riedel (University College London)


Those interested in attending should submit either a technical paper (AAAI style, 6 pages without references) or a position statement (AAAI style, 2 pages maximum) in PDF format via EasyChair. All submitted papers will be carefully peer-reviewed by multiple reviewers and low-quality or off-topic papers will be rejected. Papers will be selected either for a short oral presentation or a poster presentation.

Important Dates

  • Paper Submission: May 29
  • Notification of Acceptance: June 15
  • Camera-Ready Papers: July 10
  • Date of Workshop: July 16


Schedule

  •  8:55 a.m.: Welcome and introduction
  •  9:00 a.m.: Invited talk by Kristian Kersting
    Title: The Democratization of Optimization

    Democratizing data does not mean dropping a huge spreadsheet on everyone’s desk and saying, "good luck." It means making data mining, machine learning, and AI methods usable, so that people can easily instruct machines to have a "look" at the data and help them understand and act on it. A promising approach is the declarative "Model + Solver" paradigm that was, and is, behind many revolutions in computing: instead of outlining how a solution should be computed, we specify what the problem is using some modeling language and solve it using highly optimized solvers. Analyzing data, however, involves more than just the optimization of an objective function subject to constraints. Before optimization can take place, considerable effort is needed not only to formulate the model but also to put it in the right form. We must often build models before we know which individuals are in the domain and, therefore, before we know which variables and constraints exist. Hence modeling should facilitate the formulation of abstract, general knowledge. This concerns not only the syntactic form of the model but also the abilities of the solvers; the efficiency with which the problem can be solved is determined to a large extent by the way the model is formalized. In this talk, I shall review our recent efforts on relational linear programming, which can reveal the rich logical structure underlying many AI and data mining problems, both at the formulation and the optimization level. Ultimately, it can make optimization several times easier and more powerful than current approaches, and it is a step towards the grand challenge of automated programming as sketched by Jim Gray in his Turing Award lecture.

    Joint work with Martin Mladenov and Pavel Tokmakov, and based on previous joint work with Babak Ahmadi, Amir Globerson, Martin Grohe, Fabian Hadiji, Marion Neumann, Aziz Erkal Selman, and many more.
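    The "Model + Solver" split described above can be sketched in a few lines: the model is plain data stating what is wanted (an objective and constraints), and a generic solver decides how to find the answer. The toy problem and the naive brute-force solver below are invented for illustration only; they stand in for a real LP solver and are unrelated to the relational linear programming system discussed in the talk.

```python
# Sketch of the declarative "Model + Solver" paradigm: the model is plain
# data (objective + constraints); the solver is generic and reusable.
# Illustrative toy only -- a real system would call an optimized LP solver.
from itertools import product

# Model: maximize x + 2*y subject to x + y <= 4 and x <= 3, with x, y >= 0.
objective = lambda x, y: x + 2 * y
constraints = [
    lambda x, y: x + y <= 4,
    lambda x, y: x <= 3,
]

def solve(objective, constraints, grid=range(0, 5)):
    """Generic solver: return the feasible integer point maximizing the objective."""
    feasible = (p for p in product(grid, repeat=2)
                if all(c(*p) for c in constraints))
    return max(feasible, key=lambda p: objective(*p))

print(solve(objective, constraints))  # prints (0, 4), the optimum with value 8
```

    Note that swapping in a different objective or extra constraints requires no change to the solver, which is the point of the paradigm.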

  • 10:00 a.m.: Poster spotlights for papers 1 to 5
  • 10:30 a.m.: Coffee break
  • 11:00 a.m.: Invited talk by Sebastian Riedel
    Title: Embedding Probabilistic Logic for Machine Reading

    We want to build machines that read, and make inferences based on what was read. A long line of work in the field has focused on approaches where language is converted (possibly using machine learning) into a symbolic and relational representation. A reasoning algorithm (such as a theorem prover) then derives new knowledge from this representation. This allows rich knowledge to be captured, but generally suffers from two problems: acquiring sufficient symbolic background knowledge, and coping with noise and uncertainty in the data. Probabilistic logics (such as Markov Logic) offer a solution, but are known to often scale poorly.

    In recent years a third alternative has emerged: latent variable models in which entities and relations are embedded in vector spaces (and represented distributionally). Such approaches scale well and are robust to noise, but they raise their own set of questions: What types of inference do they support? What is a proof in embeddings? How can explicit background knowledge be injected into embeddings? In this talk, I will first present our work on latent variable models for machine reading, using ideas from matrix factorisation as well as both closed and open information extraction. Then I will present recent work we conducted to address the questions of injecting symbolic knowledge into, and extracting it from, models based on embeddings. In particular, I will show how one can rapidly build accurate relation extractors by combining logic and embeddings.
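    As a rough illustration of the embedding idea, the sketch below scores a candidate fact as the dot product between a relation vector and an entity-pair vector, in the spirit of matrix-factorization approaches. All vectors, relation names, and entities here are made up for illustration; real systems learn the embeddings from text and knowledge bases.

```python
# Toy sketch of relation embeddings: each relation and each entity pair
# gets a low-dimensional vector, and the plausibility of a fact is the
# dot product of the two, squashed to a probability. The numbers below
# are invented; real models learn them from data.
import math

relations = {"born-in": [0.9, 0.1, 0.0], "lives-in": [0.8, 0.2, 0.1]}
entity_pairs = {("Rembrandt", "Leiden"): [1.0, 0.0, 0.2]}

def score(relation, pair):
    """Plausibility score: dot product of relation and entity-pair embeddings."""
    r, p = relations[relation], entity_pairs[pair]
    return sum(ri * pi for ri, pi in zip(r, p))

def probability(relation, pair):
    """Squash the score into (0, 1) with a logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-score(relation, pair)))

print(probability("born-in", ("Rembrandt", "Leiden")))  # ≈ 0.711
```

    Because similar relations end up with similar vectors, such a model can assign plausibility to facts it never saw stated explicitly, which is one sense in which embeddings perform inference.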

  • 12:00 p.m.: Poster spotlights for papers 6 to 10
  • 12:30 p.m.: Lunch break


  • 2:30 p.m.: Invited talk by Christopher Ré
    Title: DeepDive: A Data System for Macroscopic Science

    Many pressing questions in science are macroscopic: they require scientists to integrate information from numerous data sources, often expressed in natural language or in graphics, and these forms of media are fraught with imprecision and ambiguity, which makes them difficult for machines to understand. Here I describe DeepDive, a new type of system designed to cope with these problems. It combines extraction, integration, and prediction into one system. For some paleobiology and materials science tasks, DeepDive-based systems have surpassed human volunteers in data quantity and quality (recall and precision). DeepDive is also used by scientists in areas including genomics and drug repurposing, by a number of companies involved in various forms of search, and by law enforcement in the fight against human trafficking. DeepDive does not allow users to write algorithms; instead, it asks them to write only features.

    DeepDive is open source and available on GitHub.
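    To give a flavor of the "write features, not algorithms" idea mentioned above, here is a hypothetical feature function over a candidate relation between two mentions in a sentence. The candidate format and feature names are invented for illustration, and the learning and probabilistic inference that the system itself performs are not shown.

```python
# Hypothetical sketch of user-written features: the user describes signals
# for a candidate extraction; the system (not shown) handles learning and
# inference over them. Feature names and candidate format are invented.
def features(sentence, mention_a, mention_b):
    """Emit string features for a candidate relation between two mentions."""
    words = sentence.split()
    i, j = words.index(mention_a), words.index(mention_b)
    between = words[min(i, j) + 1:max(i, j)]  # words separating the mentions
    yield "WORDS_BETWEEN=" + "_".join(between)
    yield "NUM_WORDS_BETWEEN=%d" % len(between)

print(list(features("Rembrandt was born in Leiden", "Rembrandt", "Leiden")))
```

    The division of labor is the design point: feature functions like this are easy for domain experts to write and debug, while the hard algorithmic work stays inside the system.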

  • 3:30 p.m.: Poster spotlights for papers 11 to 16
  • 4:00 p.m.: Poster session (with coffee)
  • 6:00 p.m.: End

Accepted Papers


Organizing Committee

For comments, queries and suggestions, please contact:
  • Mathias Niepert (University of Washington)
  • Guy Van den Broeck (KU Leuven)
  • Sriraam Natarajan (Indiana University)
  • David Poole (University of British Columbia)

Program Committee

  • Babak Ahmadi (Fraunhofer IAIS)
  • Rodrigo Braz (SRI International)
  • Maximilian Nickel (Ludwig Maximilian University Munich)
  • Jesse Davis (KU Leuven)
  • Sameer Singh (University of Washington)
  • Pedro Domingos (University of Washington)
  • Tushar Khot (University of Wisconsin-Madison)
  • Arthur Choi (UCLA)
  • Jude Shavlik (University of Wisconsin-Madison)
  • Kristian Kersting (TU Dortmund)
  • Brian Milch (Google)
  • Kee Siong Ng (Pivotal)
  • Daniel Lowd (University of Oregon)
  • Scott Sanner (NICTA)
  • Angelika Kimmig (KU Leuven)
  • Hendrik Blockeel (KU Leuven)
  • Jaesik Choi (Ulsan National Institute of Science and Technology)
  • Taisuke Sato (Tokyo Institute of Technology)
  • Manfred Jaeger (Aalborg University)


StarAI is currently provoking a lot of new research and has tremendous theoretical and practical implications. Theoretically, combining logic and probability in a unified representation and building general-purpose reasoning tools for it has been a dream of AI dating back to the late 1980s. Practically, successful StarAI tools will enable new applications in several large, complex real-world domains, including those involving big data, social networks, natural language processing, bioinformatics, the web, robotics, and computer vision. Such domains are often characterized by rich relational structure and large amounts of uncertainty. Logic helps to effectively handle the former, while probability helps to effectively manage the latter. We invite researchers in all subfields of AI to attend the workshop and to explore together how to reach the goals imagined by the early AI pioneers.

The focus of the workshop will be on general-purpose representation, reasoning and learning tools for StarAI as well as practical applications. Specifically, the workshop will encourage active participation from researchers in the following communities: satisfiability (SAT), knowledge representation (KR), constraint satisfaction and programming (CP), (inductive) logic programming (LP and ILP), graphical models and probabilistic reasoning (UAI), statistical learning (NIPS and ICML), graph mining (KDD and ECML PKDD) and probabilistic databases (VLDB and SIGMOD). It will also actively involve researchers from more applied communities, such as natural language processing (ACL and EMNLP), information retrieval (SIGIR, WWW and WSDM), vision (CVPR and ICCV), semantic web (ISWC and ESWC) and robotics (RSS and ICRA).

Previous Workshops

Previous StarAI workshops were held in conjunction with AAAI 2010, UAI 2012, AAAI 2013, and AAAI 2014, and were among the most popular workshops at the conferences.