Opin vísindi

Enabling Scalable Data Processing and Management through Standards-based Job Execution and the Global Federated File System


Title: Enabling Scalable Data Processing and Management through Standards-based Job Execution and the Global Federated File System
Author: Memon, Shahbaz
Riedel, Morris
Memon, Shiraz
Koeritz, Chris
Grimshaw, Andrew
Neukirchen, Helmut
Date: 2016-05-01
Language: English
Pages: 115-128
University/Institute: Háskóli Íslands
University of Iceland
School: Verkfræði- og náttúruvísindasvið (HÍ)
School of Engineering and Natural Sciences (UI)
Department: Iðnaðarverkfræði-, vélaverkfræði- og tölvunarfræðideild (HÍ)
Faculty of Industrial Eng., Mechanical Eng. and Computer Science (UI)
Series: Scalable Computing: Practice and Experience;17(2)
ISSN: 1895-1767
DOI: 10.12694/scpe.v17i2.1160
Subject: Statistical data mining; Data processing; Distributed file system; Gagnavinnsla (data processing); Gagnanám (data mining); Skráning gagna (data registration)
URI: https://hdl.handle.net/20.500.11815/184

Citation:

Shahbaz Memon, Morris Riedel, Shiraz Memon, Chris Koeritz, Andrew Grimshaw, Helmut Neukirchen. (2016). Enabling Scalable Data Processing and Management through Standards-based Job Execution and the Global Federated File System. Scalable Computing: Practice and Experience, 17(2), 115-128. DOI: http://dx.doi.org/10.12694/scpe.v17i2.1160

Abstract:

An emerging challenge for scientific communities is to efficiently process big data obtained from experimentation and computational simulations. Supercomputing architectures are available to provide scalable, high-performance processing environments, but many existing algorithm implementations are still unable to cope with their architectural complexity. One approach is to use innovative technologies that effectively exploit these resources and also deal with geographically dispersed large datasets. Those technologies should be accessible in a way that data scientists running data-intensive computations do not have to deal with the technical intricacies of the underlying execution system. Our work primarily focuses on providing data scientists with transparent access to these resources in order to easily analyze data. The impact of our work is demonstrated by describing how we enabled access to multiple high performance computing resources through an open standards-based middleware that takes advantage of the unified data management provided by the Global Federated File System. Our architectural design and its associated implementation are validated by a use case that requires massively parallel DBSCAN outlier detection on a 3D point cloud dataset.
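
A minimal illustrative sketch of the use case named in the abstract, not the paper's implementation: single-node DBSCAN outlier detection on a small synthetic 3D point cloud using scikit-learn, where points labelled -1 by DBSCAN are treated as outliers/noise. The paper's use case runs a massively parallel DBSCAN on HPC resources accessed through standards-based job execution and the Global Federated File System; the data, eps, and min_samples values below are assumptions chosen only for illustration.

import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(42)

# Synthetic 3D point cloud: one dense cluster plus a few scattered outliers.
cluster = rng.normal(loc=0.0, scale=0.5, size=(500, 3))
scattered = rng.uniform(low=-10.0, high=10.0, size=(20, 3))
points = np.vstack([cluster, scattered])

# eps and min_samples are illustrative; real runs would tune them to the
# density of the point cloud at hand.
labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(points)

# DBSCAN assigns the label -1 to noise points, which we report as outliers.
noise_mask = labels == -1
print(f"detected {noise_mask.sum()} outliers out of {len(points)} points")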

Rights:

Open Access
