Brian J. N. Wylie, Judit Giménez, Christian Feld, Markus Geimer, Germán Llort, Sandra Mendez, Estanislao Mercadal, Anke Visser, Marta García-Gasulla: 15+ years of joint parallel application performance analysis/tools training with Scalasca/Score-P and Paraver/Extrae toolsets. Future Generation Computer Systems, 162:Article No. 107472, 13 pages, January 2025.
Isabel Thärigen, Marc-André Hermanns, Markus Geimer: An Event Model for Trace-Based Performance Analysis of MPI Partitioned Point-to-Point Communication. In Proc. of the Workshop on Programming and Performance Visualization Tools (ProTools), held in conjunction with the Supercomputing Conference (SC23), Denver, CO, USA, pages 1357–1367, ACM, November 2023.
Christian Feld, Markus Geimer, Marc-André Hermanns, Pavel Saviankou, Anke Visser, Bernd Mohr: Detecting Disaster Before It Strikes: On the Challenges of Automated Building and Testing in HPC Environments. In Tools for High Performance Computing 2018 / 2019, pages 3–26, Springer International Publishing, 2021.
Brian J. N. Wylie: Exascale potholes for HPC: Execution performance and variability analysis of the flagship application code HemeLB. In Proc. of 2020 IEEE/ACM International Workshop on HPC User Support Tools (HUST) and the Workshop on Programming and Performance Visualization Tools (ProTools), held in conjunction with the Supercomputing Conference (SC20), pages 59–70, IEEE, November 2020.
Marcus Ritter, Alexandru Calotoiu, Sebastian Rinke, Thorsten Reimann, Torsten Hoefler, Felix Wolf: Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling. In Proc. of the 34th IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, pages 884–895, IEEE, May 2020.
Christian Feld, Simon Convent, Marc-André Hermanns, Joachim Protze, Markus Geimer, Bernd Mohr: Score-P and OMPT: Navigating the Perils of Callback-Driven Parallel Runtime Introspection. In Proc. of the 15th International Workshop on OpenMP (IWOMP 2019, September 11–13, 2019, Auckland, New Zealand), volume 11718 of Lecture Notes in Computer Science, pages 21–35, Springer, Cham, 2019.
Marc-André Hermanns, Nathan T. Hjelm, Michael Knobloch, Kathryn Mohror, Martin Schulz: The MPI_T events interface: An early evaluation and overview of the interface. Parallel Computing, 85:119–130, 2019.
Jan-Patrick Lehr, Alexandru Calotoiu, Christian Bischof, Felix Wolf: Automatic Instrumentation Refinement for Empirical Performance Modeling. In Proc. of the Workshop on Programming and Performance Visualization Tools (ProTools), held in conjunction with the Supercomputing Conference (SC19), Denver, CO, USA, pages 40–47, November 2019.
Alexandru Calotoiu, Thomas Höhl, Heiko Mantel, Toni Nguyen, Felix Wolf: Designing Efficient Parallel Software via Compositional Performance Modeling. In Proc. of the Workshop on Programming and Performance Visualization Tools (ProTools), held in conjunction with the Supercomputing Conference (SC19), Denver, CO, USA, pages 17–24, November 2019.
Marc Schlütter, Christian Feld, Pavel Saviankou, Michael Knobloch, Marc-André Hermanns, Bernd Mohr: SCIPHI Score-P and Cube Extensions for Intel Phi. In Tools for High Performance Computing 2017, pages 85–104, Cham, Springer International Publishing, September 2019.
Sergei Shudler, Yannick Berens, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf: Engineering Algorithms for Scalability through Continuous Validation of Performance Expectations. IEEE Transactions on Parallel and Distributed Systems, 30(8):1768–1785, August 2019.
Aamer Shah, Chihsong Kuo, Akihiro Nomura, Satoshi Matsuoka, Felix Wolf: How File-access Patterns Influence the Degree of I/O Interference between Cluster Applications. Supercomputing Frontiers and Innovations, 6(2):29–55, July 2019.
Marc-André Hermanns: Understanding the formation of wait states in one-sided communication. PhD thesis, RWTH Aachen University, Jülich, 2018.
Philip C. Roth, Kevin Huck, Ganesh Gopalakrishnan, Felix Wolf: Using Deep Learning for Automated Communication Pattern Characterization: Little Steps and Big Challenges. In Proc. of the 5th Workshop on Visual Performance Analysis (VPA), held in conjunction with the Supercomputing Conference (SC18), Dallas, TX, USA, volume 11027 of Lecture Notes in Computer Science, pages 265–272, Springer, November 2018.
Sergei Shudler, Jadran Vrabec, Felix Wolf: Understanding the Scalability of Molecular Simulation using Empirical Performance Modeling. In Proc. of the 7th Workshop on Extreme Scale Programming Tools (ESPT), held in conjunction with the Supercomputing Conference (SC18), Dallas, TX, USA, volume 11027 of Lecture Notes in Computer Science, pages 125–143, Springer, November 2018.
Michael Burger, Christian Bischof, Alexandru Calotoiu, Felix Wolf, Thomas Wunderer, Johannes Buchmann: Exploring the Performance Envelope of the LLL Algorithm. In CSE 2018 - 21st IEEE International Conference of Computational Science and Engineering, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Romania, pages 36–43, IEEE, October 2018.
Alexandru Calotoiu, Alexander Graf, Torsten Hoefler, Daniel Lorenz, Sebastian Rinke, Felix Wolf: Lightweight Requirements Engineering for Exascale Co-design. In Proc. of the 2018 IEEE International Conference on Cluster Computing (CLUSTER), Belfast, UK, pages 201–211, IEEE, September 2018.
Marc-André Hermanns, Nathan T. Hjelm, Michael Knobloch, Kathryn Mohror, Martin Schulz: Enabling callback-driven runtime introspection via MPI_T. In 25th European MPI Users' Group Meeting (EuroMPI'18), September 23–26, 2018, Barcelona, Spain, New York, NY, USA, ACM, September 2018.
Aamer Shah, Matthias S. Müller, Felix Wolf: Estimating the Impact of External Interference on Application Performance. In Proc. of the 24th Euro-Par Conference, Turin, Italy, volume 11014 of Lecture Notes in Computer Science, pages 46–58, Springer, August 2018.
Wendy Sharples, Ilya Zhukov, Markus Geimer, Klaus Görgen, Sebastian Lührs, Thomas Breuer, Bibi Naz, Ketan Kulkarni, Slavko Brdar, Stefan Kollet: A run control framework to streamline profiling, porting, and tuning simulation runs and provenance tracking of geoscientific applications. Geoscientific Model Development, 11(7):2875–2895, July 2018.
Sergei Shudler: Scalability Engineering for Parallel Programs Using Empirical Performance Models. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, June 2018.
Marc-André Hermanns, Markus Geimer, Bernd Mohr, Felix Wolf: Trace-based Detection of Lock Contention in MPI One-Sided Communication. In Tools for High Performance Computing 2016, Proc. of the 10th Parallel Tools Workshop, Stuttgart, Germany, October 2016, pages 97–114, Springer, 2017.
Daniele Tafani, Marc Schlütter, Markus Geimer, Bernd Mohr, Mathias Nachtmann, José Gracia: The Mont-Blanc Project: Second Phase successfully finished. Innovatives Supercomputing in Deutschland (inSiDE), 15(1):134–141, 2017.
Alexandru Calotoiu: Automatic Empirical Performance Modeling of Parallel Programs. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, October 2017.
Patrick Reisert, Alexandru Calotoiu, Sergei Shudler, Felix Wolf: Following the Blind Seer – Creating Better Performance Models Using Less Information. In Proc. of the 23rd Euro-Par Conference, Santiago de Compostela, Spain, volume 10417 of Lecture Notes in Computer Science, pages 106–118, Springer, August 2017.
Kashif Ilyas, Alexandru Calotoiu, Felix Wolf: Off-Road Performance Modeling – How to Deal with Segmented Data. In Proc. of the 23rd Euro-Par Conference, Santiago de Compostela, Spain, volume 10417 of Lecture Notes in Computer Science, pages 36–48, Springer, August 2017.
Daniel Lorenz, Christian Feld: Scaling Score-P to the next level. In Proc. of the International Conference on Computational Science Workshops, pages 2180–2189, Elsevier, June 2017.
Hristo Iliev, Marc-André Hermanns, Jens Henrik Göbbert, René Halver, Christian Terboven, Bernd Mohr, Matthias S. Müller: Performance Optimization of Parallel Applications in Diverse On-Demand Development Teams. In High-Performance Scientific Computing – First JARA-HPC Symposium 2016, October 4–5, 2016, Aachen, Germany, volume 10164 of Lecture Notes in Computer Science, pages 187–199, Springer International Publishing, March 2017.
Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Felix Wolf: Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications. In Proc. of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Austin, TX, USA, pages 131–143, ACM, February 2017.
Tom Vierjahn, Marc-André Hermanns, Bernd Mohr, Matthias S. Müller, Torsten W. Kuhlen, Bernd Hentschel: Using Directed Variance to Identify Meaningful Views in Call-path Performance Profiles. In Proceedings of the 3rd International Workshop on Visual Performance Analysis (VPA '16), pages 9–16, Piscataway, NJ, USA, IEEE Press, 2016.
Alexandru Calotoiu, David Beckingsale, Christopher W. Earl, Torsten Hoefler, Ian Karlin, Martin Schulz, Felix Wolf: Fast Multi-Parameter Performance Modeling. In Proc. of the 2016 IEEE International Conference on Cluster Computing (CLUSTER), Taipei, Taiwan, pages 172–181, IEEE, September 2016.
David Böhme, Markus Geimer, Lukas Arnold, Felix Voigtländer, Felix Wolf: Identifying the root causes of wait states in large-scale parallel applications. ACM Transactions on Parallel Computing, 3(2):Article No. 11, 24 pages, July 2016.
Monika Harlacher, Alexandru Calotoiu, John Dennis, Felix Wolf: Analysing the Scalability of Climate Codes Using New Features of Scalasca. In Proc. of the John von Neumann Institute for Computing (NIC) Symposium 2016, Jülich, Germany, volume 48 of NIC Series, pages 343–352, Forschungszentrum Jülich, John von Neumann Institute for Computing, February 2016.
Ilya Zhukov, Christian Feld, Markus Geimer, Michael Knobloch, Bernd Mohr, Pavel Saviankou: Scalasca v2: Back to the Future. In Proc. of Tools for High Performance Computing 2014, pages 1–24, Springer, 2015.
Laura von Rüden, Marc-André Hermanns, Michael Behrisch, Daniel Keim, Bernd Mohr, Felix Wolf: Separating the Wheat from the Chaff: Identifying Relevant and Similar Performance Data with Visual Analytics. In Proc. of the 2nd Workshop on Visual Performance Analysis (VPA), held in conjunction with the Supercomputing Conference (SC15), Austin, TX, USA, pages 4:1–4:8, ACM, 2015.
Daniel Lorenz, Sergei Shudler, Felix Wolf: Preventing the explosion of exascale profile data with smart thread-level aggregation. In Proc. of the 4th Workshop on Extreme Scale Programming Tools (ESPT), held in conjunction with the Supercomputing Conference (SC15), Austin, TX, USA, pages 1–10, ACM, November 2015.
Andreas Vogel, Alexandru Calotoiu, Alexandre Strube, Sebastian Reiter, Arne Nägel, Felix Wolf, Gabriel Wittum: 10,000 Performance Models per Minute - Scalability of the UG4 Simulation Framework. In Proc. of the 21st Euro-Par Conference, Vienna, Austria, volume 9233 of Lecture Notes in Computer Science, pages 519–531, Springer, August 2015.
Christian Iwainsky, Sergei Shudler, Alexandru Calotoiu, Alexandre Strube, Michael Knobloch, Christian Bischof, Felix Wolf: How Many Threads will be too Many? On the Scalability of OpenMP Implementations. In Proc. of the 21st Euro-Par Conference, Vienna, Austria, volume 9233 of Lecture Notes in Computer Science, pages 451–463, Springer, August 2015.
Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf: Exascaling Your Library: Will Your Implementation Meet Your Expectations? In Proc. of the International Conference on Supercomputing (ICS), Newport Beach, CA, USA, pages 165–175, ACM, June 2015.
Pavel Saviankou, Michael Knobloch, Anke Visser, Bernd Mohr: Cube v4: From Performance Report Explorer to Performance Analysis Tool. Procedia Computer Science, 51:1343–1352, June 2015.
Jie Jiang, Peter Philippen, Michael Knobloch, Bernd Mohr: Performance Measurement and Analysis of Transactional Memory and Speculative Execution on IBM Blue Gene/Q. In Proceedings of Euro-Par 2014 Parallel Processing, volume 8632 of Lecture Notes in Computer Science, pages 26–37, Springer International Publishing, 2014.
Christian Rössel, Bernd Mohr, Markus Geimer, Daniel Becker: Successful Technology Transfer with Siemens – The RAPID Project. Innovatives Supercomputing in Deutschland (inSiDE), 12(3):72–75, 2014.
Fabian Gasper, Klaus Görgen, Prabhakar Shrestha, Mauro Sulis, Jehan Rihani, Markus Geimer, Stefan Kollet: Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP v1.0) in a massively parallel supercomputing environment – a case study on JUQUEEN (IBM Blue Gene/Q). Geoscientific Model Development, 7(5):2531–2543, October 2014.
Daniel Lorenz, Robert Dietrich, Ronny Tschüter, Felix Wolf: A comparison between OPARI2 and the OpenMP tools interface in the context of Score-P. In Proc. of the 10th International Workshop on OpenMP (IWOMP), Salvador, Brazil, September 2014, volume 8766 of LNCS, pages 161–172, Springer, September 2014.
Guoyong Mao, David Böhme, Marc-André Hermanns, Markus Geimer, Daniel Lorenz, Felix Wolf: Catching Idlers with Ease: A Lightweight Wait-State Profiler for MPI Programs. In EuroMPI '14: Proc. of the 21st European MPI Users' Group Meeting, Kyoto, Japan, pages 103–108, ACM, September 2014.
Chihsong Kuo, Aamer Shah, Akihiro Nomura, Satoshi Matsuoka, Felix Wolf: How File Access Patterns Influence Interference Among Cluster Applications. In Proc. of the IEEE International Conference on Cluster Computing (CLUSTER), Madrid, Spain, pages 1–8, IEEE, September 2014.
Felix Wolf, Christian Bischof, Torsten Hoefler, Bernd Mohr, Gabriel Wittum, Alexandru Calotoiu, Christian Iwainsky, Alexandre Strube, Andreas Vogel: Catwalk: A Quick Development Path for Performance Models. In Euro-Par 2014: Parallel Processing Workshops, volumes 8805 and 8806 of Lecture Notes in Computer Science, Springer, September 2014.
Alexandru Calotoiu, Torsten Hoefler, Felix Wolf: Mass-producing Insightful Performance Models. In Workshop on Modeling & Simulation of Systems and Applications, University of Washington, Seattle, Washington, August 2014.
Marc Schlütter, Peter Philippen, Laurent Morin, Markus Geimer, Bernd Mohr: Profiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware. In Parallel Computing: Accelerating Computational Science and Engineering (CSE), volume 25 of Advances in Parallel Computing, pages 773–782, IOS Press, March 2014.
Julien Jaeger, Peter Philippen, Eric Petit, Andres Charif Rubial, Christian Rössel, William Jalby, Bernd Mohr: Binary Instrumentation for Scalable Performance Measurement of OpenMP Applications. In Parallel Computing: Accelerating Computational Science and Engineering (CSE), volume 25 of Advances in Parallel Computing, pages 783–792, IOS Press, March 2014.
David Böhme: Characterizing Load and Communication Imbalance in Parallel Applications. PhD thesis, RWTH Aachen University, volume 23 of IAS Series, Forschungszentrum Jülich, February 2014, ISBN 978-3-89336-940-9.
Ilya Zhukov, Brian J. N. Wylie: Assessing Measurement and Analysis Performance and Scalability of Scalasca 2.0. In Proc. of the Euro-Par 2013: Parallel Processing Workshops, volume 8374 of Lecture Notes in Computer Science, pages 627–636, Springer, January 2014.
Andreas Knüpfer, Robert Dietrich, Jens Doleschal, Markus Geimer, Marc-André Hermanns, Christian Rössel, Ronny Tschüter, Bert Wesarg, Felix Wolf: Generic Support for Remote Memory Access Operations in Score-P and OTF2. In Tools for High Performance Computing 2012, Proc. of the 6th Parallel Tools Workshop, Stuttgart, Germany, September 2012, pages 57–74, Springer, 2013.
Daniel Lorenz, David Böhme, Bernd Mohr, Alexandre Strube, Zoltán Szebenyi: Extending Scalasca’s Analysis Features. In Tools for High Performance Computing 2012, pages 115–126, Springer Berlin Heidelberg, 2013.
Alexandre E. Eichenberger, John M. Mellor-Crummey, Martin Schulz, Michael Wong, Nawal Copty, John DelSignore, Robert Dietrich, Xu Liu, Eugene Loh, Daniel Lorenz: OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis. In Proc. of the 9th International Workshop on OpenMP (IWOMP), Canberra, Australia, LNCS, pages 171–185, Berlin / Heidelberg, Springer, 2013.
Bernd Mohr, Vladimir Voevodin, Judit Giménez, Erik Hagersten, Andreas Knüpfer, Dmitry A. Nikitenko, Mats Nilsson, Harald Servat, Aamer Shah, Frank Winkler, Felix Wolf, Ilya Zhukov: The HOPSA Workflow and Tools. In Tools for High Performance Computing 2012, Proc. of the 6th Parallel Tools Workshop, Stuttgart, Germany, September 2012, pages 127–146, Springer, 2013.
Alexandru Calotoiu, Torsten Hoefler, Marius Poke, Felix Wolf: Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes. In Proc. of the ACM/IEEE Conference on Supercomputing (SC13), Denver, CO, USA, pages 1–12, ACM, November 2013.
Marc-André Hermanns, Manfred Miklosch, David Böhme, Felix Wolf: Understanding the formation of wait states in applications with one-sided communication. In EuroMPI '13: Proc. of the 20th European MPI Users' Group Meeting, Madrid, Spain, September 15–18, 2013, pages 73–78, New York, NY, USA, ACM, September 2013.
Aamer Shah, Felix Wolf, Sergey Zhumatiy, Vladimir Voevodin: Capturing inter-application interference on clusters. In Proc. of the IEEE International Conference on Cluster Computing (CLUSTER), Indianapolis, IN, USA, pages 1–5, IEEE, September 2013.
Brian J. N. Wylie, Wolfgang Frings: Scalasca support for MPI+OpenMP parallel applications on large-scale HPC systems based on Intel Xeon Phi. In Proc. XSEDE'13 Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery (San Diego, CA, USA), ACM, July 2013.
Daniel Becker, Markus Geimer, Rolf Rabenseifner, Felix Wolf: Extending the scope of the controlled logical clock. Cluster Computing, 16(1):171–189, March 2013.
Marc-André Hermanns, Sriram Krishnamoorthy, Felix Wolf: A scalable infrastructure for the performance analysis of passive target synchronization. Parallel Computing, 39(3):132–145, March 2013.
Markus Geimer, Pavel Saviankou, Alexandre Strube, Zoltán Szebenyi, Felix Wolf, Brian J. N. Wylie: Further improving the scalability of the Scalasca toolset. In Proc. of PARA 2010: State of the Art in Scientific and Parallel Computing, Part II: Minisymposium Scalable tools for High Performance Computing, Reykjavik, Iceland, June 6–9 2010, volume 7134 of Lecture Notes in Computer Science, pages 463–474, Springer, 2012.
Dieter an Mey, Scott Biersdorff, Christian Bischof, Kai Diethelm, Dominic Eschweiler, Michael Gerndt, Andreas Knüpfer, Daniel Lorenz, Allen D. Malony, Wolfgang E. Nagel, Yury Oleynik, Christian Rössel, Pavel Saviankou, Dirk Schmidl, Sameer S. Shende, Michael Wagner, Bert Wesarg, Felix Wolf: Score-P: A Unified Performance Measurement System for Petascale Applications. In Proc. of the CiHPC: Competence in High Performance Computing, HPC Status Konferenz der Gauß-Allianz e.V., Schwetzingen, Germany, June 2010, pages 85–97, Gauß-Allianz, Springer, 2012.
Dominic Eschweiler, Michael Wagner, Markus Geimer, Andreas Knüpfer, Wolfgang E. Nagel, Felix Wolf: Open Trace Format 2 - The Next Generation of Scalable Trace Formats and Support Libraries. In Proc. of the Intl. Conference on Parallel Computing (ParCo), Ghent, Belgium, August 30 – September 2 2011, volume 22 of Advances in Parallel Computing, pages 481–490, IOS Press, 2012.
Ulf Andersson, Brian J. N. Wylie: Performance engineering of GemsFDTD computational electromagnetics solver. In Proc. of PARA 2010: State of the Art in Scientific and Parallel Computing, Reykjavík, Iceland, Part I, volume 7133 of Lecture Notes in Computer Science, pages 314–324, Springer, 2012.
Andreas Knüpfer, Christian Rössel, Dieter an Mey, Scott Biersdorff, Kai Diethelm, Dominic Eschweiler, Markus Geimer, Michael Gerndt, Daniel Lorenz, Allen D. Malony, Wolfgang E. Nagel, Yury Oleynik, Peter Philippen, Pavel Saviankou, Dirk Schmidl, Sameer S. Shende, Ronny Tschüter, Michael Wagner, Bert Wesarg, Felix Wolf: Score-P – A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir. In Tools for High Performance Computing 2011, Proc. of the 5th Parallel Tools Workshop, Dresden, Germany, September 2011, pages 79–91, Springer, 2012.
Zoltán Szebenyi: Capturing Parallel Performance Dynamics. PhD thesis, RWTH Aachen University, volume 12 of IAS Series, Forschungszentrum Jülich, 2012, ISBN 978-3-89336-798-6.
Christian Rössel, Bernd Mohr, Michael Gerndt, Felix Wolf: Performance Dynamics of Massively Parallel Codes. Innovatives Supercomputing in Deutschland (inSiDE), 10(2):72–73, 2012.
David Böhme, Marc-André Hermanns, Felix Wolf: Scalasca. In Entwicklung und Evolution von Forschungssoftware, Rolduc, November 2011, volume 14 of Aachener Informatik-Berichte, Software Engineering, pages 43–48, Shaker, 2012.
Christian Rössel, Bernd Mohr, Felix Wolf: Score-P. In Entwicklung und Evolution von Forschungssoftware, Rolduc, Niederlande, November 2011, volume 14 of Aachener Informatik-Berichte, Software Engineering, pages 23–30, Shaker, 2012.
Daniel Lorenz, Peter Philippen, Dirk Schmidl, Felix Wolf: Profiling of OpenMP tasks with Score-P. In Proc. of the 41st International Conference on Parallel Processing Workshops (ICPPW), Workshop on Parallel Software Tools and Tool Infrastructures (PSTI), pages 444–453, September 2012.
Marc-André Hermanns, Markus Geimer, Bernd Mohr, Felix Wolf: Scalable detection of MPI-2 remote memory access inefficiency patterns. Intl. Journal of High Performance Computing Applications (IJHPCA), 26(3):227–236, August 2012.
Alexandru Calotoiu, Christian Siebert, Felix Wolf: Pattern-Independent Detection of Manual Collectives in MPI Programs. In Proc. of the 18th Euro-Par Conference, Rhodes Island, Greece, volume 7484 of Lecture Notes in Computer Science, pages 28–39, Springer, August 2012.
Dirk Schmidl, Peter Philippen, Daniel Lorenz, Christian Rössel, Markus Geimer, Dieter an Mey, Bernd Mohr, Felix Wolf: Performance Analysis Techniques for Task-Based OpenMP Applications. In Proc. of the 8th International Workshop on OpenMP (IWOMP), Rome, Italy, volume 7312 of Lecture Notes in Computer Science, pages 196–209, Berlin / Heidelberg, Springer, June 2012.
David Böhme, Bronis R. de Supinski, Markus Geimer, Martin Schulz, Felix Wolf: Scalable Critical-Path Based Performance Analysis. In Proc. of the 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Shanghai, China, pages 1330–1340, IEEE, May 2012.
David Böhme, Markus Geimer, Felix Wolf: Characterizing Load and Communication Imbalance in Large-Scale Parallel Applications. In Proc. of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops and PhD Forum (IPDPSW), Shanghai, China, pages 2538–2541, IEEE, May 2012.
Felix Wolf: Understanding the Formation of Wait States in Parallel Programs. Innovatives Supercomputing in Deutschland (inSiDE), 9(1):94–95, 2011.
Felix Wolf: Scalasca. In Encyclopedia of Parallel Computing, pages 1775–1785, Springer, October 2011.
Jan Mußler, Daniel Lorenz, Felix Wolf: Reducing the overhead of direct application instrumentation using prior static analysis. In Proc. of the 17th Euro-Par Conference, Bordeaux, France, volume 6852 of Lecture Notes in Computer Science, pages 65–76, Springer, September 2011.
Markus Geimer, Marc-André Hermanns, Christian Siebert, Felix Wolf, Brian J. N. Wylie: Scaling Performance Tool MPI Communicator Management. In Proc. of the 18th European MPI Users' Group Meeting (EuroMPI), Santorini, Greece, volume 6960 of Lecture Notes in Computer Science, pages 178–187, Springer, September 2011.
Marc-André Hermanns, Sriram Krishnamoorthy, Felix Wolf: A Scalable Replay-based Infrastructure for the Performance Analysis of One-sided Communication. In Proc. of the 1st Intl. Workshop on High-performance Infrastructure for Scalable Tools (WHIST), held in conjunction with the International Conference on Supercomputing (ICS), Tucson, AZ, USA, June 2011.
Zoltán Szebenyi, Todd Gamblin, Martin Schulz, Bronis R. de Supinski, Felix Wolf, Brian J. N. Wylie: Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs. In Proc. of the 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Anchorage, AK, USA, pages 640–648, IEEE, May 2011.
Brian J. N. Wylie, Markus Geimer: Large-scale performance analysis of PFLOTRAN with Scalasca. In Proc. of the 53rd Cray User Group meeting, Fairbanks, AK, USA, Cray User Group Inc., May 2011.
Zoltán Szebenyi, Felix Wolf, Brian J. N. Wylie: Performance Analysis of Long-running Applications. In Proc. of the 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS) PhD Forum, Anchorage, AK, USA, pages 2100–2103, IEEE, May 2011.
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Daniel Becker, David Böhme, Wolfgang Frings, Marc-André Hermanns, Bernd Mohr, Zoltán Szebenyi: Recent Developments in the Scalasca Toolset. In Tools for High Performance Computing 2009, Proc. of the 3rd Parallel Tools Workshop, Dresden, Germany, September 2009, chapter 4, pages 39–51, Springer, 2010.
Brian J. N. Wylie: Improved Scalasca toolset support for performance analysis of Cray XT systems. In HPC-Europa2: Science and Supercomputing in Europe - Research Highlights 2009, page 67, CINECA Consorzio Interuniversitario, Casalecchio di Reno (Bologna), Italy, 2010.
Bernd Mohr, Brian J. N. Wylie, Felix Wolf: Performance measurement and analysis tools for extremely scalable systems. Concurrency and Computation: Practice and Experience, 22(16):2212–2229, 2010, (ISC 2008 Award).
Daniel Becker: Timestamp Synchronization of Concurrent Events. PhD thesis, RWTH Aachen University, volume 4 of IAS Series, Forschungszentrum Jülich, 2010, ISBN 978-3-89336-625-5.
Marc-André Hermanns: HPC-Europa2: Science and Supercomputing in Europe research highlights 2009. In HPC-Europa2: Science and Supercomputing in Europe research highlights 2010, page 101, CINECA Consorzio Interuniversitario, Casalecchio di Reno (Bologna), Italy, 2010.
Brian J. N. Wylie, Markus Geimer, Bernd Mohr, David Böhme, Zoltán Szebenyi, Felix Wolf: Large-scale performance analysis of Sweep3D with the Scalasca toolset. Parallel Processing Letters, 20(4):397–414, December 2010.
David Böhme, Markus Geimer, Felix Wolf, Lukas Arnold: Identifying the root causes of wait states in large-scale parallel applications. In Proc. of the 39th International Conference on Parallel Processing (ICPP), San Diego, CA, USA, pages 90–100, IEEE, September 2010, Best Paper Award.
Daniel Becker, Markus Geimer, Rolf Rabenseifner, Felix Wolf: Synchronizing the Timestamps of Concurrent Events in Traces of Hybrid MPI/OpenMP Applications. In Proc. of IEEE International Conference on Cluster Computing (CLUSTER), Heraklion, Greece, pages 38–47, IEEE, September 2010.
Daniel Lorenz, Bernd Mohr, Christian Rössel, Dirk Schmidl, Felix Wolf: How to reconcile event-based performance analysis with tasking in OpenMP. In Proc. of 6th Int. Workshop of OpenMP (IWOMP), Tsukuba, Japan, volume 6132 of Lecture Notes in Computer Science, pages 109–121, Springer, June 2010.
Brian J. N. Wylie, David Böhme, Wolfgang Frings, Markus Geimer, Bernd Mohr, Zoltán Szebenyi, Daniel Becker, Marc-André Hermanns, Felix Wolf: Scalable performance analysis of large-scale parallel applications on Cray XT systems with Scalasca. In Proc. 52nd Cray User Group Meeting, Edinburgh, Scotland, Cray User Group Incorporated, May 2010.
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Erika Ábrahám, Daniel Becker, Bernd Mohr: The Scalasca performance toolset architecture. Concurrency and Computation: Practice and Experience, 22(6):702–719, April 2010.
Brian J. N. Wylie, David Böhme, Bernd Mohr, Zoltán Szebenyi, Felix Wolf: Performance analysis of Sweep3D on Blue Gene/P with the Scalasca toolset. In Proc. 24th International Parallel and Distributed Processing Symposium and Workshops (IPDPS), Atlanta, GA, USA, IEEE, April 2010.
David Böhme, Marc-André Hermanns, Markus Geimer, Felix Wolf: Performance Simulation of Non-blocking Communication in Message-Passing Applications. In Proc. of the 2nd Workshop on Productivity and Performance (PROPER) in conjunction with Euro-Par 2009, Delft, The Netherlands, volume 6043 of Lecture Notes in Computer Science, pages 208–217, Springer, March 2010.
Felix Wolf, David Böhme, Markus Geimer, Marc-André Hermanns, Bernd Mohr, Zoltán Szebenyi, Brian J. N. Wylie: Performance Tuning in the Petascale Era. In Proc. of the John von Neumann Institute for Computing (NIC) Symposium 2010, Jülich, Germany, volume 3 of IAS Series, pages 339–346, Forschungszentrum Jülich, John von Neumann Institute for Computing, February 2010.
Zoltán Szebenyi, Brian J. N. Wylie, Felix Wolf: Scalasca Parallel Performance Analyses of PEPC. In Proc. of the 1st Workshop on Productivity and Performance (PROPER) in conjunction with Euro-Par 2008, Las Palmas de Gran Canaria, Spain, volume 5415 of Lecture Notes in Computer Science, pages 305–314, Springer, 2009.
Felix Wolf: Performance Tools for Petascale Systems. Innovatives Supercomputing in Deutschland (inSiDE), 7(2):38–39, 2009.
Daniel Becker, Rolf Rabenseifner, Felix Wolf, John Linford: Scalable timestamp synchronization for event traces of message-passing applications. Parallel Computing, 35(12):595–607, December 2009.
Zoltán Szebenyi, Felix Wolf, Brian J. N. Wylie: Space-Efficient Time-Series Call-Path Profiling of Parallel Applications. In Proc. of the ACM/IEEE Conference on Supercomputing (SC09), Portland, OR, USA, ACM, November 2009.
Wolfgang Frings, Felix Wolf, Ventsislav Petkov: Scalable Massively Parallel I/O to Task-Local Files. In Proc. of the ACM/IEEE Conference on Supercomputing (SC09), Portland, OR, USA, ACM, November 2009.
Marc-André Hermanns, Markus Geimer, Bernd Mohr, Felix Wolf: Scalable Detection of MPI-2 Remote Memory Access Inefficiency Patterns. In Proc. of the 16th European PVM/MPI Users' Group Meeting (EuroPVM/MPI), Espoo, Finland, volume 5759 of Lecture Notes in Computer Science, pages 31–41, Springer, September 2009.
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Bernd Mohr: A scalable tool architecture for diagnosing wait states in massively parallel applications. Parallel Computing, 35(7):375–388, July 2009.
Markus Geimer, Sameer S. Shende, Allen D. Malony, Felix Wolf: A Generic and Configurable Source-Code Instrumentation Component. In Proc. of the International Conference on Computational Science (ICCS), Baton Rouge, LA, USA, volume 5545 of Lecture Notes in Computer Science, pages 696–705, Springer, May 2009.
PDF DOI BibTeX
Marc-André Hermanns: Trace-based performance simulation of large-scale applications. University of Hagen, May 2009.
PDF URL BibTeX
Daniel Becker, Rolf Rabenseifner, Felix Wolf, John Linford: Replay-based synchronization of timestamps in event traces of massively parallel applications. Scalable Computing: Practice and Experience, 10(1):49–60, March 2009.
PDF URL BibTeX
Marc-André Hermanns, Markus Geimer, Felix Wolf, Brian J. N. Wylie: Verifying Causality Between Distant Performance Phenomena in Large-Scale MPI Applications. In Proc. of the 17th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), Weimar, Germany, pages 78–84, IEEE, February 2009.
PDF DOI BibTeX
Brian J. N. Wylie, Markus Geimer, Felix Wolf: Performance measurement and analysis of large-scale parallel applications on leadership computing systems. Scientific Programming, 16(2-3):167–181, 2008.
PDF URL DOI BibTeX
Felix Wolf, Brian J. N. Wylie, Erika Ábrahám, Daniel Becker, Wolfgang Frings, Karl Fürlinger, Markus Geimer, Marc-André Hermanns, Bernd Mohr, Shirley Moore, Matthias Pfeifer, Zoltán Szebenyi: Usage of the SCALASCA Toolset for Scalable Performance Analysis of Large-Scale Parallel Applications. In Tools for High Performance Computing, Proc. of the 2nd Parallel Tools Workshop, Stuttgart, Germany, July 2008, pages 157–167, Springer, 2008.
PDF DOI BibTeX
Ventsislav Petkov: Beiträge zum Wissenschaftlichen Rechnen – Ergebnisse des Gaststudentenprogramms 2008 des John von Neumann-Instituts für Computing, chapter SIONlib - Scalable I/O Library for Native Parallel Access to Binary Files. Forschungszentrum Jülich, Technical Report FZJ-JSC-IB-2008-07, pages 93–105, December 2008.
PDF BibTeX
Daniel Becker, Rolf Rabenseifner, Felix Wolf: Implications of non-constant clock drifts for the timestamps of concurrent events. In Proc. of the IEEE International Conference on Cluster Computing (CLUSTER), Tsukuba, Japan, pages 59–68, IEEE, September 2008.
PDF DOI BibTeX
Daniel Becker, John Linford, Rolf Rabenseifner, Felix Wolf: Replay-based synchronization of timestamps in event traces of massively parallel applications. In Proc. of the International Conference on Parallel Processing Workshops (ICPPW), 1st International Workshop on Simulation and Modelling in Emergent Computational Systems (SMECS), Portland, OR, USA, pages 212–219, IEEE, September 2008.
PDF DOI BibTeX
Daniel Becker, Morris Riedel, Achim Streit, Felix Wolf: Grid-Based Workflow Management for Automatic Performance Analysis of Massively Parallel Applications. In Proc. of the 3rd CoreGRID Workshop on Grid Middleware, Barcelona, Spain, CoreGRID Series, pages 103–118, Springer, June 2008.
PDF DOI BibTeX
Zoltán Szebenyi, Brian J. N. Wylie, Felix Wolf: SCALASCA Parallel Performance Analyses of SPEC MPI2007 Applications. In Proc. of the 1st SPEC International Performance Evaluation Workshop (SIPEW), Darmstadt, Germany, volume 5119 of Lecture Notes in Computer Science, pages 99–123, Springer, June 2008.
PDF DOI BibTeX
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Erika Ábrahám, Daniel Becker, Bernd Mohr: The SCALASCA Performance Toolset Architecture. In International Workshop on Scalable Tools for High-End Computing (STHEC), Kos, Greece, pages 51–65, June 2008.
PDF BibTeX
Oscar Hernandez, Fengguang Song, Barbara Chapman, Jack Dongarra, Bernd Mohr, Shirley Moore, Felix Wolf: Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications. In Proc. of the 2nd International Workshop on OpenMP (IWOMP 2006), Reims, France, volume 4315 of Lecture Notes in Computer Science, pages 267–278, Springer, June 2008.
PDF DOI BibTeX
Marc-André Hermanns, Markus Geimer, Felix Wolf, Brian J. N. Wylie: Verifying Causal Connections between Distant Performance Phenomena in Large-Scale Message-Passing Applications. Technical Report FZJ-JSC-IB-2008-05, Forschungszentrum Jülich, April 2008.
PDF BibTeX
Daniel Becker, Wolfgang Frings, Felix Wolf: Performance Evaluation and Optimization of Parallel Grid Computing Applications. In Proc. of the 16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Toulouse, France, pages 193–199, IEEE, February 2008.
PDF DOI BibTeX
Felix Wolf, Daniel Becker, Markus Geimer, Brian J. N. Wylie: Scalable Performance Analysis Methods for the Next Generation of Supercomputers. In Proc. of the John von Neumann Institute for Computing (NIC) Symposium, Jülich, Germany, volume 39 of NIC-Series, pages 315–322, February 2008.
PDF BibTeX
Markus Geimer, Felix Wolf, Andreas Knüpfer, Bernd Mohr, Brian J. N. Wylie: A Parallel Trace-Data Interface for Scalable Performance Analysis. In Proc. of the 8th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA), Umeå, Sweden, June 2006, volume 4699 of Lecture Notes in Computer Science, pages 398–408, Springer, 2007.
PDF DOI BibTeX
Brian J. N. Wylie, Felix Wolf, Bernd Mohr, Markus Geimer: Integrated Runtime Measurement Summarisation and Selective Event Tracing for Scalable Parallel Execution Performance Diagnosis. In Proc. of the 8th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA), Umeå, Sweden, June 2006, volume 4699 of Lecture Notes in Computer Science, pages 460–469, Springer, 2007.
PDF DOI BibTeX
Christian Bischof, Felix Wolf: Produktivität versus Performanz in der Simulation. RWTH Themen, 2:38–39, 2007.
BibTeX
M. Behbahani, Marek Behr, Christian Bischof, Felix Wolf: Kranken Herzen helfen. RWTH Themen, 1:44–46, 2007.
BibTeX
Daniel Becker, Wolfgang Frings, Felix Wolf: Performance Evaluation and Optimization of Metacomputing Applications. In Proc. of the 3rd Workshop on Communication in Cluster- and Grid-Systems (KiCC, Kommunikation in Clusterrechnern und Clusterverbundsystemen), Aachen, Germany, pages 32–39. RWTH Aachen University, December 2007.
PDF URL BibTeX
John Linford: CESRI 2007 Research Report - Implementation and Validation of the Extended Controlled Logical Clock. FZJ-JSC-IB-2007-11, Forschungszentrum Jülich, November 2007.
PDF BibTeX
Markus Geimer, Björn Kuhlmann, Farzona Pulatova, Felix Wolf, Brian J. N. Wylie: Scalable Collation and Presentation of Call-Path Profile Data with CUBE. In Proc. of the Conference on Parallel Computing (ParCo), Aachen/Jülich, Germany, pages 645–652, September 2007, Minisymposium Scalability and Usability of HPC Programming Tools.
PDF BibTeX
Daniel Becker, Rolf Rabenseifner, Felix Wolf: Timestamp Synchronization for Event Traces of Large-Scale Message-Passing Applications. In Proc. of the 14th European PVM/MPI Users' Group Meeting (EuroPVM/MPI), Paris, France, volume 4757 of Lecture Notes in Computer Science, pages 315–325, Springer, September 2007.
PDF DOI BibTeX
Brian J. N. Wylie, Markus Geimer, Mike Nicolai, Markus Probst: Performance analysis and tuning of the XNS CFD solver on BlueGene/L. In Proc. of the 14th European PVM/MPI Users' Group Meeting (EuroPVM/MPI), Paris, France, volume 4757 of Lecture Notes in Computer Science, pages 107–116, Springer, September 2007.
PDF BibTeX
Allen D. Malony, Sameer S. Shende, Alan Morris, Felix Wolf: Compensation of Measurement Overhead in Parallel Performance Profiling. International Journal of High Performance Computing Applications, 21(2):174–194, May 2007.
PDF DOI BibTeX
Brian J. N. Wylie: Scalable performance analysis of large-scale parallel applications on MareNostrum. In Science and Supercomputing in Europe, pages 453–461, CINECA Consorzio Interuniversitario, Casalecchio di Reno (Bologna), Italy, April 2007. Also available as SSCinEurope 2007 CD.
PDF URL BibTeX
Daniel Becker, Felix Wolf, Wolfgang Frings, Markus Geimer, Brian J. N. Wylie, Bernd Mohr: Automatic Trace-Based Performance Analysis of Metacomputing Applications. In Proc. of the International Parallel and Distributed Processing Symposium (IPDPS), Long Beach, CA, USA, IEEE, March 2007.
PDF DOI BibTeX
Felix Wolf, Bernd Mohr, Jack Dongarra, Shirley Moore: Automatic analysis of inefficiency patterns in parallel applications. Concurrency and Computation: Practice and Experience, 19(11):1481–1496, February 2007.
PDF DOI BibTeX
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Bernd Mohr: Scalable Parallel Trace-Based Performance Analysis. Innovatives Supercomputing in Deutschland (inSiDE), 4(2):16–19, 2006.
PDF URL BibTeX
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Bernd Mohr: Scalable Parallel Trace-Based Performance Analysis. In Proc. of the 13th European PVM/MPI Users' Group Meeting (EuroPVM/MPI), Bonn, Germany, volume 4192 of Lecture Notes in Computer Science, pages 303–312, Springer, September 2006.
PDF DOI BibTeX
Andrej Kühnal, Marc-André Hermanns, Bernd Mohr, Felix Wolf: Specification of Inefficiency Patterns for MPI-2 One-sided Communication. In Proc. of the 12th Euro-Par Conference, Dresden, Germany, volume 4128 of Lecture Notes in Computer Science, pages 47–62, Springer, August 2006.
PDF DOI BibTeX
Gaby Aguilera, Patricia J. Teller, Michaela Taufer, Felix Wolf: A Systematic Multi-step Methodology for Performance Analysis of Communication Traces of Distributed Applications based on Hierarchical Clustering. In Proc. of the 5th International Workshop on Performance Modeling, Evaluation, and Organization of Parallel and Distributed Systems (PMEO-PDS, in conjunction with IPDPS 2006), Rhodes Island, Greece, IEEE, April 2006.
PDF DOI BibTeX
Felix Wolf, Felix Freitag, Bernd Mohr, Shirley Moore, Brian J. N. Wylie: Large Event Traces in Parallel Performance Analysis. In Proc. of the 8th Workshop on Parallel Systems and Algorithms (PASA), Frankfurt, Germany, volume P-81 of Lecture Notes in Informatics, pages 264–273, Gesellschaft für Informatik, March 2006.
PDF BibTeX
Felix Wolf, Allen D. Malony, Sameer S. Shende, Alan Morris: Trace-Based Parallel Performance Overhead Compensation. In Proc. of the International Conference on High Performance Computing and Communications (HPCC), Sorrento, Italy, volume 3726 of Lecture Notes in Computer Science, pages 617–628, Springer, September 2005.
PDF DOI BibTeX
Shirley Moore, Felix Wolf, Jack Dongarra, Sameer S. Shende, Allen D. Malony, Bernd Mohr: A Scalable Approach to MPI Application Performance Analysis. In Proc. of the 12th European PVM/MPI Users' Group Meeting (EuroPVM/MPI), Sorrento, Italy, volume 3666 of Lecture Notes in Computer Science, pages 309–316, Springer, September 2005.
PDF DOI BibTeX
Brian J. N. Wylie, Bernd Mohr, Felix Wolf: Holistic Hardware Counter Performance Analysis of Parallel Programs. In Proc. of the Conference on Parallel Computing (ParCo), Malaga, Spain, pages 187–194, September 2005.
PDF BibTeX
Bernd Mohr, Andrej Kühnal, Marc-André Hermanns, Felix Wolf: Performance Analysis of One-sided Communication Mechanisms. In Proc. of the Conference on Parallel Computing (ParCo), Malaga, Spain, September 2005, Minisymposium Performance Analysis.
PDF BibTeX
Marc-André Hermanns, Bernd Mohr, Felix Wolf: Event-based Measurement and Analysis of One-sided Communication. In Proc. of the 11th Euro-Par Conference, Lisboa, Portugal, volume 3648 of Lecture Notes in Computer Science, pages 156–165, Springer, August 2005.
PDF DOI BibTeX
Bernd Mohr, Luiz A. DeRose, Jeffrey S. Vetter: A Performance Measurement Infrastructure for Co-Array Fortran. In Proc. of the 11th Euro-Par Conference, Lisboa, Portugal, volume 3648 of Lecture Notes in Computer Science, pages 156–165, Springer, August 2005.
PDF DOI BibTeX
Nikhil Bhatia, Fengguang Song, Felix Wolf, Bernd Mohr, Jack Dongarra, Shirley Moore: Automatic Experimental Analysis of Communication Patterns in Virtual Topologies. In Proc. of the International Conference on Parallel Processing (ICPP), Oslo, Norway, pages 465–472, IEEE Society, June 2005.
PDF DOI BibTeX
P. Worley, J. Candy, L. Carrington, K. Huck, T. Kaiser, G. Mahinthakumar, Allen D. Malony, Shirley Moore, D. Reed, P. Roth, H. Shan, Sameer S. Shende, A. Snavely, S. Sreepathi, Felix Wolf, Y. Zhang: Performance Analysis of GYRO: A Tool Evaluation. In Proc. of the 2005 SciDAC Conference, San Francisco, CA, USA, June 2005.
PDF BibTeX
Nikhil Bhatia, Shirley Moore, Felix Wolf, Jack Dongarra, Bernd Mohr: A Pattern-Based Approach to Automated Application Performance Analysis. In Workshop on Patterns in High Performance Computing (patHPC 2005), Urbana-Champaign, IL, USA, May 2005.
PDF BibTeX
Shirley Moore, Felix Wolf, Jack Dongarra, Bernd Mohr: Improving Time to Solution with Automated Performance Analysis. In 2nd Workshop on Productivity and Performance in High-End Computing (P-PHEC), San Francisco, CA, USA, February 2005.
PDF BibTeX
Fengguang Song, Felix Wolf: CUBE User Manual. ICL-UT-04-01, University of Tennessee, Innovative Computing Laboratory, 2004.
PDF BibTeX
Felix Wolf: EARL - API Documentation. ICL-UT-04-03, University of Tennessee, Innovative Computing Laboratory, October 2004.
PDF BibTeX
Felix Wolf, Bernd Mohr, Jack Dongarra, Shirley Moore: Efficient Pattern Search in Large Traces through Successive Refinement. In Proc. of the 10th Euro-Par Conference, Pisa, Italy, volume 3149 of Lecture Notes in Computer Science, pages 47–54, Springer, August 2004.
PDF DOI BibTeX
Fengguang Song, Felix Wolf, Nikhil Bhatia, Jack Dongarra, Shirley Moore: An Algebra for Cross-Experiment Performance Analysis. In Proc. of the International Conference on Parallel Processing (ICPP), Montreal, Canada, pages 63–72, IEEE Society, August 2004.
PDF DOI BibTeX
Philip Mucci, Jack Dongarra, Rick Kufrin, Shirley Moore, Fengguang Song, Felix Wolf: Automating the Large-Scale Collection and Analysis of Performance Data on Linux Clusters. In 5th LCI International Conference on Linux Clusters: The HPC Revolution, Austin, TX, USA, May 2004.
PDF URL BibTeX
Felix Wolf, Bernd Mohr: Automatic performance analysis of hybrid MPI/OpenMP applications. Journal of Systems Architecture, 49(10-11):421–439, November 2003.
PDF DOI BibTeX
Felix Wolf, Bernd Mohr: Hardware-Counter Based Automatic Performance Analysis of Parallel Programs. In Proc. of the Conference on Parallel Computing (ParCo), Dresden, Germany, volume 13 of Advances in Parallel Computing, pages 753–760, Elsevier, September 2003, Minisymposium Performance Analysis.
PDF DOI BibTeX
Felix Wolf, Bernd Mohr: KOJAK - A Tool Set for Automatic Performance Analysis of Parallel Applications. In Proc. of the 9th Euro-Par Conference, Klagenfurt, Austria, volume 2790 of Lecture Notes in Computer Science, pages 1301–1304, Springer, August 2003, Demonstrations of Parallel and Distributed Computing.
PDF DOI BibTeX
Felix Wolf: Automatic Performance Analysis on Parallel Computers with SMP Nodes. PhD thesis, RWTH Aachen, Forschungszentrum Jülich, February 2003, NIC Series Volume 17, ISBN 3-00-010003-2.
URL BibTeX
Felix Wolf, Bernd Mohr: Automatic Performance Analysis of Hybrid MPI/OpenMP Applications. In Proc. of the 11th Euromicro Workshop on Parallel, Distributed and Network-Based Processing (PDP), Genoa, Italy, pages 13–22, IEEE, February 2003.
PDF DOI BibTeX
Bernd Mohr, Allen D. Malony, H. C. Hoppe, F. Schlimbach, G. Haab, J. Hoeflinger, S. Shah: A Performance Monitoring Interface for OpenMP. In Proc. of the 4th European Workshop on OpenMP (EWOMP), Rome, Italy, September 2002.
PDF BibTeX
Luiz A. DeRose, Felix Wolf: CATCH – A Call-Graph Based Automatic Tool for Capture of Hardware Performance Metrics for MPI and OpenMP Applications. In Proc. of the 8th Euro-Par Conference, Paderborn, Germany, volume 2400 of Lecture Notes in Computer Science, pages 167–176, Springer, August 2002.
PDF DOI BibTeX
Bernd Mohr, Allen D. Malony, Sameer S. Shende, Felix Wolf: Design and Prototype of a Performance Tool Interface for OpenMP. The Journal of Supercomputing, 23(1):105–128, August 2002.
PDF DOI BibTeX
Bernd Mohr, Allen D. Malony, Sameer S. Shende, Felix Wolf: Design and Prototype of a Performance Tool Interface for OpenMP. In 2nd Annual Los Alamos Computer Science Institute Symposium (LACSI), Santa Fe, NM, USA, October 2001.
PDF BibTeX
Felix Wolf, Bernd Mohr: Specifying Performance Properties of Parallel Applications Using Compound Events. Parallel and Distributed Computing Practices, 4(3):301–317, September 2001.
PDF URL BibTeX
Bernd Mohr, Allen D. Malony, Sameer S. Shende, Felix Wolf: Towards a Performance Tool Interface for OpenMP: An Approach based on Directive Rewriting. In 3rd European Workshop on OpenMP (EWOMP), Barcelona, Spain, September 2001.
PDF BibTeX
Thomas Fahringer, Michael Gerndt, Bernd Mohr, G. Riley, J. L. Träff, Felix Wolf: Knowledge Specification for Automatic Performance Analysis. FZJ-ZAM-IB-2001-08, ESPRIT IV Working Group APART, Forschungszentrum Jülich, August 2001, Revised version.
PDF BibTeX
K. A. Lindlan, J. Cuny, Allen D. Malony, Bernd Mohr, R. Rivenburgh, C. Rasmussen: A Tool Framework for Static and Dynamic Analysis of Object-Oriented Software with Templates. In Proc. of the Supercomputing Conference (SC2000), Dallas, TX, USA, November 2000.
PDF BibTeX
Felix Wolf, Bernd Mohr: Automatic Performance Analysis of MPI Applications Based on Event Traces. In Proc. of the 6th Euro-Par Conference, Munich, Germany, volume 1900 of Lecture Notes in Computer Science, pages 123–132, Springer, August 2000.
PDF DOI BibTeX
Michael Gerndt, Hans-Georg Eßer: Specification Techniques for Automatic Performance Analysis Tools. In Proc. of the 8th International Workshop on Compilers for Parallel Computers (CPC), Aussois, France. Ecole Normale Supérieure Lyon, January 2000.
PDF BibTeX
Felix Wolf, Bernd Mohr: EARL - A Programmable and Extensible Toolkit for Analyzing Event Traces of Message Passing Programs. In Proc. of the 7th International Conference on High Performance Computing and Networking Europe (HPCN), Amsterdam, The Netherlands, volume 1593 of Lecture Notes in Computer Science, pages 503–512, Springer, April 1999.
PDF DOI BibTeX
Michael Gerndt, Bernd Mohr, Felix Wolf, Mario Pantano: Performance Analysis on Cray T3E. In Proc. of the 7th Euromicro Workshop on Parallel and Distributed Processing (PDP), Funchal, Madeira, Portugal, pages 241–248, IEEE, February 1999.
PDF URL BibTeX
Michael Gerndt, Bernd Mohr, Mario Pantano, Felix Wolf: Automatic Performance Analysis for Cray T3E. In Proc. of the 7th Workshop on Compilers for Parallel Computers (CPC), University of Linköping, Sweden, pages 69–78, June 1998.
BibTeX