BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.2.3.1//EN
BEGIN:VEVENT
UID:56@lincs.fr
DTSTART;TZID=Europe/Paris:20160418T150000
DTEND;TZID=Europe/Paris:20160418T160000
DTSTAMP:20170313T170917Z
URL:https://www.lincs.fr/events/data-cleaning-in-the-big-data-era/
SUMMARY:Data Cleaning in the Big Data era @ Barrault (Amphi Rubis)
DESCRIPTION:\n\n\nAbstract:\nIn the “big data”
 era\, data is often dirty in nature because of several reasons\, such as
 typos\, missing values\, and duplicates. The intrinsic problem with dirty
 data is that it can lead to poor results in analytic tasks. Therefore\,
 data cleaning is an unavoidable task in data preparation to have reliable
 data for final applications\, such as querying and mining. Unfortunately\,
 data cleaning is hard in practice and it requires a great amount of manual
 work. Several systems have been proposed to increase automation and
 scalability in the process. They rely on a formal\, declarative approach
 based on first-order logic: users provide high-level specifications of
 their tasks\, and the systems compute optimal solutions without human
 intervention on the generated code. However\, traditional
 ‘top-down’ cleaning approaches quickly
 become impractical when dealing with the complexity and variety found in
 big data. In this talk\, we first describe recent results in tackling data
 cleaning with a declarative approach. We then discuss how this experience
 has pushed several groups to propose new systems that recognize the central
 role of the users in cleaning big data.\n\n\nBiography:\nPaolo Papotti is
 an Assistant Professor of Computer Science in the School of Computing\,
 Informatics\, and Decision Systems Engineering (CIDSE) at Arizona State
 University. He received his Ph.D. in Computer Science from
 Università degli Studi Roma Tre (2007\, Italy) and before
 joining ASU he had been a senior scientist at Qatar Computing Research
 Institute. His research is focused on systems that assist users in
 complex\, necessary tasks and that scale to large datasets with efficient
 algorithms and distributed platforms. His work has been recognized with two
 “Best of the Conference” citations (SIGMOD
 2009\, VLDB 2015) and with a best demo award at SIGMOD 2015. He is group
 leader for SIGMOD 2016 and associate editor for the ACM Journal of Data and
 Information Quality (JDIQ).
CATEGORIES:Seminars
END:VEVENT
BEGIN:VTIMEZONE
TZID:Europe/Paris
X-LIC-LOCATION:Europe/Paris
BEGIN:DAYLIGHT
DTSTART:20160327T030000
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
END:DAYLIGHT
END:VTIMEZONE
END:VCALENDAR