Dryad and DryadLINQ
Presented by Yin Zhu, April 22, 2013
Slides taken from the DryadLINQ project page: http://research.microsoft.com/en-us/projects/dryadlinq/default.aspx
Distributed Data-Parallel Programming using Dryad, EuroSys’07
Andrew Birrell, Mihai Budiu, Dennis Fetterly, Michael Isard, Yuan Yu
Microsoft Research Silicon Valley
Dryad goals
• General-purpose execution environment for distributed, data-parallel applications
  – Concentrates on throughput, not latency
  – Assumes a private data center
• Automatic management of scheduling, distribution, fault tolerance, etc.
Talk outline
• Computational model
• Dryad architecture
• Some case studies
• DryadLINQ overview
• Summary
A typical data-intensive query: Ulfar’s most frequently visited web pages
    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);
    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;
    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());
    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
Steps in the query
    // Go through logs and keep only lines that are not comments.
    // Parse each line into a LogEntry object.
    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);
    // Go through logentries and keep only entries that are accesses by ulfar.
    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;
    // Group ulfar’s accesses according to what page they correspond to.
    // For each page, count the occurrences.
    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());
    // Sort the pages ulfar has accessed according to access frequency.
    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
Serial execution
    // For each line in logs, do…
    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);
    // For each entry in logentries, do…
    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;
    // Sort entries in user by page. Then iterate over the sorted list,
    // counting the occurrences of each page as you go.
    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());
    // Re-sort entries in accesses by page frequency.
    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
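The serial plan annotated on this slide can be sketched as ordinary single-threaded code. The sketch below is in Python rather than C# so that it runs on its own; the `LogEntry` fields, the `parse` helper and the `"<user> <page>"` log format are assumptions made for illustration, since the slides do not define them.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    # Hypothetical record layout; the slides do not define LogEntry's fields.
    user: str
    page: str


def parse(line: str) -> LogEntry:
    # Assumed log format: "<user> <page>" separated by whitespace.
    user, page = line.split()
    return LogEntry(user, page)


def serial_query(logs, user_suffix="\\ulfar", page_suffix=".htm"):
    # Pass 1: drop comment lines and parse the rest.
    entries = [parse(line) for line in logs if not line.startswith("#")]
    # Pass 2: keep only this user's accesses.
    mine = [e for e in entries if e.user.endswith(user_suffix)]
    # Sort by page, then count runs of equal pages in a single sweep.
    mine.sort(key=lambda e: e.page)
    counts = []
    for e in mine:
        if counts and counts[-1][0] == e.page:
            counts[-1][1] += 1
        else:
            counts.append([e.page, 1])
    # Keep .htm pages and re-sort by descending frequency.
    htm = [(page, n) for page, n in counts if page.endswith(page_suffix)]
    htm.sort(key=lambda pc: pc[1], reverse=True)
    return htm
```

Every step here scans or sorts the whole data set on one machine, which is the cost the parallel plan on the next slide is meant to spread out.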
Parallel execution
    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);
    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;
    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());
    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
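The diagram that accompanied this slide is not recoverable from the text, so the following is only one plausible way to run the same query over partitioned logs, not the plan Dryad actually generates. It is a Python sketch in which threads stand in for cluster machines; the function names, the `"<user> <page>"` log format and the hash-based repartitioning are all assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def filter_partition(lines, user_suffix="\\ulfar"):
    # Stage 1, per input partition: drop comments, parse, keep one user's pages.
    # Assumed log format: "<user> <page>"; the slides do not specify it.
    pages = []
    for line in lines:
        if line.startswith("#"):
            continue
        user, page = line.split()
        if user.endswith(user_suffix):
            pages.append(page)
    return pages


def repartition_by_page(page_lists, n):
    # Shuffle: route every occurrence of a page to the same bucket,
    # so each bucket can compute complete counts for its pages.
    buckets = [[] for _ in range(n)]
    for pages in page_lists:
        for page in pages:
            buckets[hash(page) % n].append(page)
    return buckets


def count_partition(pages, page_suffix=".htm"):
    # Stage 2, per bucket: count the .htm pages and sort them locally.
    counts = Counter(p for p in pages if p.endswith(page_suffix))
    return sorted(counts.items(), key=lambda pc: (-pc[1], pc[0]))


def parallel_query(partitions, n_buckets=2):
    with ThreadPoolExecutor() as pool:
        filtered = list(pool.map(filter_partition, partitions))
        buckets = repartition_by_page(filtered, n_buckets)
        counted = list(pool.map(count_partition, buckets))
    # Final stage: combine the per-bucket results into one ordering
    # (descending count, ties broken by page name).
    merged = [pc for part in counted for pc in part]
    return sorted(merged, key=lambda pc: (-pc[1], pc[0]))
```

The filtering stage needs no communication between partitions; only the grouping step forces data to move, which is why the repartitioning sits between the two parallel stages.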
How does Dryad fit in?
• Many programs can be represented as a distributed execution graph
  – The programmer may not have to know this
• “SQL-like” queries: LINQ
  – Spark (NSDI’12) uses the same idea
• Dryad will run them for you
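To make "a program as an execution graph" concrete, here is a minimal Python sketch: vertices are plain functions, edges say whose output feeds whom, and a tiny runner executes each vertex once its inputs exist. This is not Dryad's API; `run_graph`, the vertex names and the toy data are hypothetical, and a real system would place vertices on different machines rather than run them in one loop.

```python
from graphlib import TopologicalSorter


def run_graph(vertices, inputs_of, sources):
    """Run each vertex once all of its upstream vertices have produced output.

    vertices:  name -> function taking a list of upstream outputs
    inputs_of: name -> list of upstream names (the edges of the graph)
    sources:   name -> initial data for nodes that have no upstream
    """
    results = dict(sources)
    for name in TopologicalSorter(inputs_of).static_order():
        if name in results:
            continue  # source data, nothing to run
        upstream = [results[dep] for dep in inputs_of.get(name, [])]
        results[name] = vertices[name](upstream)
    return results


def drop_comments(ins):
    # Vertex body: keep the non-comment lines of its single input.
    return [line for line in ins[0] if not line.startswith("#")]


# A three-vertex graph: two independent filters feeding one merge.
vertices = {
    "filter0": drop_comments,
    "filter1": drop_comments,
    "merge": lambda ins: sorted(ins[0] + ins[1]),
}
inputs_of = {
    "filter0": ["logs0"],
    "filter1": ["logs1"],
    "merge": ["filter0", "filter1"],
}
sources = {"logs0": ["# c", "b"], "logs1": ["a"]}
out = run_graph(vertices, inputs_of, sources)
```

Because `filter0` and `filter1` share no edge, a scheduler is free to run them at the same time; only `merge` has to wait for both.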
Talk outline
• Computational model
• Dryad architecture
• Some case studies
• DryadLINQ overview
• Summary