An Embedded DSL for High Performance Declarative Communication with Correctness Guarantees in C++

Mahajan, Nilesh; Holk, Eric; Chauhan, Arun; Lumsdaine, Andrew

doi:10.1007/978-3-319-29778-1_13

Nilesh Mahajan¹⁶,
Eric Holk¹⁶,
Arun Chauhan¹⁶ &
…
Andrew Lumsdaine¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9519))

Included in the following conference series:

Languages and Compilers for Parallel Computing

701 Accesses

Abstract

High performance programming using explicit communication calls needs considerable programming expertise to optimize. Tuning for performance often involves using asynchronous calls, running the risk of introducing bugs and making the program harder to debug. Techniques to prove desirable program properties, such as deadlock freedom, invariably incur significant performance overheads.

We have developed a domain-specific language, embedded in C++, called Kanor that enables programmers to specify the communication declaratively in the Bulk Synchronous Parallel (BSP) style. Deadlock freedom is guaranteed for well-formed Kanor programs. We start with operational semantics for a subset of Kanor and prove deadlock freedom and determinism properties based on those semantics. We then show how the declarative nature of Kanor allows us to detect and optimize communication patterns.

A. Chauhan—Currently at Google Inc.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+

from $39.99 /Month

Starting from 10 chapters or articles per month
Access and download chapters and articles from more than 300k books and 2,500 journals
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

A Sound Strategy to Compile General Recursion into Finite Depth Pattern Matching

Context-Dependent Effects in Guarded Interaction Trees

Equivalential Structures for Binary and Ternary Syllogistics

Article 25 September 2017

Notes

1.
This is motivated by the fact that one-sided put operations are usually more efficient than one-sided get operations.

References

Bailey, D.H., Barszcz, E., Barton, J.T., Browning, D.S., Carter, R.L., Fatoohi, R.A., Frederickson, P.O., Lasinski, T.A., Simon, H.D., Venkatakrishnan, V., Weeratunga, S.K.: The NAS parallel benchmarks. Int. J. Supercomput. Appl. 5(3), 63–73 (1991)
Article Google Scholar
Callahan, D., Kennedy, K.: Analysis of interprocedural side effects in a parallel programming environment. J. Parallel Distrib. Comput. 5(5), 517–550 (1988)
Article Google Scholar
Cottam, J.A., Holk, E., Byrd, W.E., Chauhan, A., Lumsdaine, A.: High-level coordination specification: operational semantics for kanor. In: Workshop on Leveraging Abstractions and Semantics in High-Performance Computing (LASH-C; Workshop at PPoPP 2013), February 2013
Google Scholar
Czarnecki, K., O’Donnell, J.T., Striegnitz, J., Taha, W.: DSL implementation in metaocaml, template haskell, and C++. In: Lengauer, C., Batory, D., Blum, A., Odersky, M. (eds.) Domain-Specific Program Generation. LNCS, vol. 3016, pp. 51–72. Springer, Heidelberg (2004)
Chapter Google Scholar
Danalis, A., Pollock, L., Swany, M., Cavazos, J.: Mpi-aware compiler optimizations for improving communication-computation overlap. In Proceedings of the 23rd International Conference on Supercomputing, ICS 2009, pp. 316–325. ACM, New York (2009)
Google Scholar
Gava, F., Fortin, J.: Formal semantics of a subset of the paderborn’s BSPlib. In: Proceedings of the Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2008, pp. 269–276. IEEE Computer Society, Washington, DC (2008)
Google Scholar
Gorlatch, S.: Send-receive considered harmful: myths and realities of message passing. ACM Trans. Program. Lang. Syst. 26(1), 47–56 (2004)
Article Google Scholar
Hoefler, T., Schneider. T.: Runtime detection and optimization of collective communication patterns. In: Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 263–272. ACM (2012)
Google Scholar
Jiao, F., Mahajan, N., Willcocok, J., Chauhan, A., Lumsdaine, A.: Partial globalization of partitioned address spaces for zero copy communication with shared memory. In: Proc. of the 18th International Conference on High Performance Computing (HiPC) (2011). doi:10.1109/HiPC.2011.6152733
Ragan-Kelley, J., Adams, A., Paris, S., Levoy, M., Amarasinghe, S., Durand, F.: Decoupling algorithms from schedules for easy optimization of image processing pipelines. ACM Trans. Graph. 31(4), 32:1–32:12 (2012)
Article Google Scholar
Valiant, L.G.: Bulk-synchronous parallel computers. In: Reeve, M. (ed.) Parallel Processing and Artificial Intelligence, pp. 15–22. John Wiley & Sons (1989)
Google Scholar

Download references

Author information

Authors and Affiliations

Indiana University, Bloomington, IN, 47405, USA
Nilesh Mahajan, Eric Holk, Arun Chauhan & Andrew Lumsdaine

Authors

Nilesh Mahajan
View author publications
Search author on:PubMed Google Scholar
Eric Holk
View author publications
Search author on:PubMed Google Scholar
Arun Chauhan
View author publications
Search author on:PubMed Google Scholar
Andrew Lumsdaine
View author publications
Search author on:PubMed Google Scholar

Corresponding author

Correspondence to Nilesh Mahajan .

Editor information

Editors and Affiliations

North Carolina State University, Raleigh, North Carolina, USA
Xipeng Shen
North Carolina State University, Raleigh, North Carolina, USA
Frank Mueller
North Carolina State University, Raleigh, North Carolina, USA
James Tuck

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Mahajan, N., Holk, E., Chauhan, A., Lumsdaine, A. (2016). An Embedded DSL for High Performance Declarative Communication with Correctness Guarantees in C++. In: Shen, X., Mueller, F., Tuck, J. (eds) Languages and Compilers for Parallel Computing. LCPC 2015. Lecture Notes in Computer Science(), vol 9519. Springer, Cham. https://doi.org/10.1007/978-3-319-29778-1_13

Download citation

DOI: https://doi.org/10.1007/978-3-319-29778-1_13
Published: 20 February 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29777-4
Online ISBN: 978-3-319-29778-1
eBook Packages: Computer ScienceComputer Science (R0)Springer Nature Proceedings Computer Science

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Publish with us

Policies and ethics