DoE SciDAC University of Tennessee

Optimizing Performance and Enhancing Functionality of Distributed Applications using Logistical Networking

Micah Beck, Jack Dongarra, James Plank, Rich Wolski

Communication is the foundation for all collaboration, and accordingly the growth of the kind distributed collaboration envisioned by the SciDAC Program has closely tracked the development of high-performance networking. Sustained collaboration, however, inevitably involves communication with both synchronous and asynchronous elements. While current forms of networking address the former, the latter requires uses of storage that are not part of the current network model. Applications that need significant amounts of non-archival, "working storage" for distributed state management, latency hiding, and exposed buffer management, are not well served by traditional models of file and database storage services. By providing best-effort control over substantial shared, untrusted state management resources (RAM and disk), at network intermediate nodes and at shared edge locations, our project on Logistical Networking (LoN) addresses these needs by implementing communication services that IP networking alone have not been able to provide.

As the name suggests, the goal of Logistical Networking is to bring data transmission and storage within one framework, much as military or industrial logistics treat transportation lines and storage depots as coordinate elements of one infrastructure. Achieving this goal requires a form of network storage that is globally scalable, meaning that the storage systems attached to the global data transmission network can expand beyond organizational and geographical boundaries and can grow large enough to be global in scope.

The approach to this problem we developed for SciDAC is based upon an exposed approach to computer service architecture, in which one offers client software a primitive service the semantics of which are closely based on the underlying physical infrastructure. Higher-level tools and protocols with more abstract semantics running on clients compose these primitive services to implement more sophisticated run-time algorithms for applications. In this way the infrastructure of LoN follows the exposed approach embodied in IP datagram service. LoN uses scalable storage resources in the network to model communication in both its synchronous and asynchronous aspects; its fundamental service is the movement of data between buffers on adjacent nodes.

The Internet Backplane Protocol (IBP) supplies LoN with its basic mechanism. Itimplements scalable management of storage in the network using shared physical resources. IBP is implemented by intermediate nodes, called "depots," that offer standard storage operations, including allocate, load, store and third party copy. From a networking perspective, an IBP depot can be viewed as a kind of router that exposes buffers to clients, thereby enabling the implementation of flexible network services, such as overlay multicast with explicit specification of the delivery tree by the source.

To create a shared resource fabric that exposes network storage for general use in this way, we defined a new storage stack (Figure 1), analogous to the Internet stack, using a bottom-up and layered design approach that adheres to the end-to-end principles. IBP is the lowest layer of the storage stack that is globally accessible from the network. Its design is modeled on the design of IP datagram delivery. Just as IP is a more abstract service based on link-layer datagram delivery, so IBP is a more abstract service based on blocks of data (on disk, memory, tape or other media) that are managed as "byte arrays." By masking the details of the storage at the local level - fixed block size, differing failure modes, local addressing schemes - this byte array abstraction allows a uniform IBP model to be applied to storage resources generally. The use of IP networking to access IBP storage resources creates a globally accessible storage service.

As the case of IP shows, however, in order to scale globally the service guarantees that IBP offers must be weakened, i.e. it must present a "best effort" storage service. First and foremost, this means that IBP storage allocations can be time limited. When the lease on an IBP allocation expires, the storage resource can be reused and all data structures associated with it can be deleted. Additionally an IBP allocation can be refused by a storage resource in response to over-allocation, much as routers can drop packets, and such "admission decisions" can be based on both size and duration. Enforcing time limits puts transience into storage allocation, giving it some of the fluidity of datagram forwarding; more importantly, it makes network storage far more sharable, and creates an infrastructure that can scale.

The semantics of IBP storage allocation also assume that an IBP storage resource can be transiently unavailable. Since the user of remote storage resources depends on so many uncontrolled, remote variables, it may be necessary to assume that storage can be permanently lost. Thus, IBP is a "best effort" storage service. To encourage the sharing of idle resources, IBP even supports "soft" storage allocation semantics, where allocated storage can be revoked at any time. In all cases such weak semantics mean that the level of service must be characterized statistically.

Like the higher levels of the traditional network stack, the higher levels of the network storage stack - exNode, Logistical Runtime System (LoRS), Logistical Backbone (L-bone, and the Logistical Files System - implement abstractions and services necessary to aggregate and utilize the generic best effort storage service provided by IBP. A discussion of each of these layers is integrated into the account of our work on Logistical Networking for the SciDAC research community presented below.

For further information please view our SciDAC Project Review (PDF).

 

 

This project is supported by the Department of Energy Scientific Discovery through Advanced Computing program under grant #DE-FC02-01ER25465.

 

Meetings and Activities:

SC 2003

"The Livny and Plank-Beck Problems: Studies in Data Movement on the Computational Grid" (submitted)

"Remote Visualization by Browsing Image Based Databases with Logistical Networking" (submitted)

Phoenix, AZ; Nov 15-21, 2003
http://www.sc-conference.org/sc2003/

 

16th APAN Meeting

Asia-Pacific Advanced Network Consortium Meeting, Logistical Networking Session

"Introduction to Logistical Networking" (to appear)

Busan, Korea; Aug 28, 2003
http://apan.net/meetings/busan03/ cs-logistics.htm#abs1

 

ACM SIGCOMM 2003

"An End-to-End Approach to Globally Scalable Programmable Networking" (to appear at FDNA-03, Workshop on Future Directions in Network Architecture)

Karlsruhe, Germany; Aug 27, 2003
http://www.acm.org/sigcomm/
sigcomm2003

 

SciDAC 2003 Review Meeting

"SciDAC Review: Optimizing Performance and Enhancing Functionality of Distributed Applications Using Logistical Networking" (to appear)

Argonne, IL; Aug 18-20, 2003
http://www-fp.mcs.anl.gov/middleware-review/

 

CCGrid 2003

"An Exposed Approach to Reliable Multicast in Heterogeneous Logistical Networks" (at GAN'03, Grids and Advanced Networks Workshop)

Tokyo, Japan; May 12-15, 2003
http://ccgrid2003.apgrid.org/

 

2003 PI Meeting

Napa, CA; March 10-11, 2003 http://www.nersc.gov/conferences/ SciDAC2003

 

ESCC/Internet2 Joint Techs Meeting

Presentation and Poster Session (at the Terascale Supernova Initiative Collaboration Meeting)

Miami, FL; Feb 2-6, 2003 http://events.internet2.edu/2003/
ESCC2-03/ESCC2-03-index.html

 

ASIAN'02

Asian Computing Science Conference 2002

"The Logistical Backbone: Scalable Infrastructure for Global Data Grids"

Hanoi, Vietnam; Dec 4-6, 2002
http://www.lirmm.fr/asian02/

 

SC 2002

"Logistical Networking: the Logistical Runtime System" (poster presentation)

"Live Demonstration of Logistical Networking" (QuickTime movie)

Baltimore, MD; Nov 16-22, 2002
http://www.sc-2002.org/

 

iGrid 2002

"Video IBPster" (Flyer distributed during live demonstration)

"Video IBPster Demostration" (Recording of live demonstration)

Amsterdam, The Netherlands; Sept 23-26, 2002
http://www.igrid2002.org/

 

ACM SIGCOMM 2002

"An End-to-End Approach to Globally Scalable Network Storage"

Pittsburgh, PA; Aug 19-23, 2002
http://www.acm.org/sigcomm/
sigcomm2002/

 

IC 2002

The 3rd International Conference on Internet Computing

"Logistical Storage in Active Networking: a promising framework for network services"

Las Vegas, NV; June 24-27, 2002
http://www.informatik.uni-trier.de/
~ley/db/conf/ic/ic2002-3.html

 

CCGrid 2002

The 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid

"The Internet Backplane Protocol: A Study in Resource Sharing"

"Logistical Networking: When Institutions Peer" (at the 2nd International Workshop on Global and Peer-to-Peer Computing on Large Scale Distributed Systems)

Berlin, Germany; May 21-24, 2002
http://ccgrid2002.zib.de/

 

FTPDS 02

IEEE Annual Workshop on Fault Tolerant Parallel and Distributed Systems, in conjunction with the International Parallel and Distributed Processing Symposium

"Fault Tolerance in the Network Storage Stack"

Ft Lauderdale, FL; April 15-19, 2002
http://www.ece.neu.edu/groups/ netcom/FTPDS02_program.html

 

FAST '02

USENIX Conference on File and Storage Technologies

"Logistical Networking Research and the Network Storage Stack" (Work in Progress report)

Monterey, CA; Jan 28-30, 2002
http://www.usenix.org/publications/ library/proceedings/fast02/

 

SciDAC Kickoff Meeting

"Logistical Networking: the Internet Backplane Protocol" (poster presentation)

"Logistical Networking" (Data Management panel presentation)

Reston, VA; Jan 15-16, 2002
http://www-unix.mcs.anl.gov/
discovery/kickoff.html