Using the Intel® SDK for OpenCL™ Applications XE 2013 with the Intel® Xeon Phi™ Coprocessor

Chapter 1 – Introduction

This document is designed to help users get started writing code and running OpenCL* 1.2 applications using the Intel® SDK for OpenCL™ Applications for Linux* on a system that includes the Intel Xeon Phi Coprocessor.

More specifically, the Intel SDK for OpenCL Applications XE 2013 for Linux used in this whitepaper is version 3.0.67279. The SDK supports both the Intel® Xeon server and Intel Xeon Phi coprocessor.

1.1 – Overview

Open Computing Language (OpenCL*) is an open standard for general-purpose parallel programming of heterogeneous systems. The OpenCL* specification is ratified by the Khronos* group at http://khronos.org/opencl .

The Intel® SDK for OpenCL* Applications XE 2013 for Linux is based on the published OpenCL* 1.2 Khronos Specification.

It currently supports the Intel® C++ Compiler for Linux OS. Users can write their code in C and C++. Users can obtain the SDK from http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe#download . The Release Notes can be found on the website, under the tag “Product Documents”: http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe .

1.2 – Compatibility

The Intel® SDK for OpenCL™ supports the following Linux operating systems:

- Red Hat Enterprise Linux 6.1 kernel 2.6.32-131 (64-bit version)

- SUSE* Linux Enterprise Server 11 SP1 kernel 2.6.32.12-0.7-default (64-bit version)

Depending on the Intel® Many-core Platform Software Stack (MPSS) running on the above platforms, you need to use the correct compiler (Intel® Composer XE 2013 for Linux OS).

The table below summarizes the versions that are supported.

SDK Version	MPSS Version	Compiler Version
3.0.67279	2.1.6720-13	2013.4.183
3.0.64133	2.1.4982-15 or 2.1.5889-16	2013.3.163
3.0.56860	2.1.4982-15	2013.2.146

Table 1: Intel OpenCL SDK Compatibility.

Note that the Intel OpenCL SDK for Linux OS supports multiple Intel Xeon Phi coprocessors.

The first part of this whitepaper shows how to install the Intel SDK for OpenCL on a Linux OS. The second part shows how to run an OpenCL* sample code on an Intel Xeon Phi Coprocessor.

Chapter 2 – Installing Intel SDK for OpenCL Applications XE 2013 for Linux

To start, you must have installed the latest version of the Intel C/C++ Compiler as well as the Intel MPSS. In this paper, the Intel C/C++ Composer XE 2013 update 4 and the Intel MPSS Gold Update 3 are used. The Installation Notes document is available on the main page under the tag “Product Documents”: http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe

You can purchase these software development tools from http://software.intel.com/en-us/linux-tool-suites. These instructions assume that you downloaded the Intel® OpenCL* SDK and have the intel_sdk_for_ocl_applications_2013_xe_sdk_3.0.67279_x64.tgz file.

Also, before installing the Intel SDK for OpenCL, please uninstall previous installations of the SDK that are older than Intel SDK for OpenCL Applications XE 2013 beta. You must have root permissions to successfully install the SDK.

Once you have acquired a copy of the tools, extract the tar file for Intel SDK for OpenCL. This package contains the OpenCL* C header files, development tools, and the OpenCL* runtime and compiler for Linux operating systems.

# tar -xvf intel_sdk_for_ocl_applications_2013_xe_sdk_3.0.67279_x64.tgz
# cd intel_sdk_for_ocl_applications_2013_xe_sdk_3.0.67279_x64

To install the runtime for the CPU as well as the Intel Xeon Phi coprocessor, run the install-cpu+mic.sh script as root. The runtime will be installed into the /opt/intel/opencl-1.2-3.0.67279 directory.

# sudo ./install-cpu+mic.sh

Alternatively, you can use the RPM package manager to install the runtime. To install the runtime for the CPU as well as the coprocessor, run the following command:

For RedHat Enterprise Linux OS:

# sudo yum install *base*.rpm *intel-cpu*.rpm *intel-mic*.rpm

For SUSE Linux Enterprise Server OS:

# sudo zypper install *base*.rpm *intel-cpu*.rpm *intel-mic*.rpm

To install the developer tools along with the runtimes, execute the following command:

For RedHat Enterprise Linux OS:

# sudo yum install *base*.rpm *intel-cpu*.rpm *intel-mic*.rpm *devel*.rpm

For SUSE Linux Enterprise Server OS:

# sudo zypper install *base*.rpm *intel-cpu*.rpm *intel-mic*.rpm *devel*.rpm

After successfully installing these two packages, you should see the following directories under opencl-1.2-3.0.67279:

# ls /opt/intel/opencl-1.2-3.0.67279
bin
doc
etc
include
lib64
libmic

Note that lib64 and libmic contain the runtime libraries for the CPU and the coprocessors, respectively.

Chapter 3 – Compiling and Running a Sample OpenCL* Program

This section includes a sample OpenCL* program written in C. We will show how to compile and run the program for the Intel Xeon Phi Coprocessor.

The sample program is an implementation of Gaussian Kernel Smoothing or Gaussian Smoothing, which is most commonly used in image processing. Gaussian smoothing tries to reduce the level of noise in an image, thereby making image processing algorithms more robust against noise. This technique finds application in fields such as medical imaging, graphics software and computer vision.

Because Gaussian Kernel Smoothing is an image processing application, it possesses a high degree of parallelism and hence is well suited for the Intel Xeon Phi coprocessor. Also, the operations in this algorithm closely resemble a 5-point 2D stencil operation, which is commonly used in high performance computing. These two traits make this application an excellent candidate for an OpenCL application running on the Intel Xeon Phi coprocessor.

Gaussian smoothing uses a Gaussian function for calculating the transformation to apply to each pixel in an image. A Gaussian function in one dimension has the form:

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{x^2}{2\sigma^2}}$$

Similarly, the Gaussian function in two dimensions has the form:

$$G(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

where σ is the standard deviation of the Gaussian distribution, x is the distance from the origin in the horizontal axis and y is the distance from the origin in the vertical axis. Using the values from this distribution, a weight matrix, also called a convolution kernel, is created and then applied to the image. For each pixel in the image, the pixel’s new value is calculated as a weighted mean of the pixel’s neighborhood.

The weight matrix is such that the weights are inversely proportional to the distance from the center pixel. The original pixel’s value receives the heaviest weight and neighboring pixels receive smaller weights as their distance to the original pixel increases. For this implementation, we use the following weight matrix or convolution kernel.

1/16	2/16	1/16
2/16	4/16	2/16
1/16	2/16	1/16

Table 2: Weight Matrix

In this implementation, the host divides the work between the available Intel Xeon Phi coprocessors such that each coprocessor works on an equal share of the rows in the image. Each coprocessor applies the Gaussian smoothing to each pixel assigned to it. To deal with the boundaries of the input image, the implementation pads the edges of the input image by duplicating the edge pixels. As with any OpenCL application, the host is charged with setting up the OpenCL platform, choosing devices and cleaning up after the execution is completed.
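The padding and the row split described above might look as follows on the host side. This is a minimal sketch under stated assumptions, not the sample program's code: the names `pad_replicate` and `first_row` are hypothetical, the image is assumed to be stored row by row with one byte per pixel, and the number of rows is assumed to be a multiple of the number of devices.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a width x height grayscale image into a (width+2) x (height+2)
   buffer, duplicating the edge pixels into the one-pixel border. */
void pad_replicate(const unsigned char *src, size_t width, size_t height,
                   unsigned char *dst)
{
    assert(src && dst && width > 0 && height > 0);
    size_t pw = width + 2;
    for (size_t py = 0; py < height + 2; ++py) {
        /* Clamp the padded row index back into the source image. */
        size_t sy = py == 0 ? 0 : (py > height ? height - 1 : py - 1);
        for (size_t px = 0; px < pw; ++px) {
            /* Clamp the padded column index in the same way. */
            size_t sx = px == 0 ? 0 : (px > width ? width - 1 : px - 1);
            dst[py * pw + px] = src[sy * width + sx];
        }
    }
}

/* First image row owned by device `dev` when `rows` rows are split
   evenly across `ndev` devices (assumes rows is a multiple of ndev). */
size_t first_row(size_t rows, size_t ndev, size_t dev)
{
    assert(ndev > 0 && dev < ndev && rows % ndev == 0);
    return dev * (rows / ndev);
}
```

With a 3x3 kernel, a device that owns rows `first_row(...)` through `first_row(...) + rows / ndev - 1` also needs to read one row above and one row below its share, which the padded buffer provides for the first and last device.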

The pseudo code is shown below:

       Host initializes the OpenCL Platform and selects the OpenCL devices. 
       Host reads an NxN input image
       For each device
              Transfer (N/#devices) rows of input image from host to device
       For each device
              For each pixel assigned to the device
                     Set sum = 0
                     For original pixel and all 8 neighbors
                           Calculate sum = sum + weight of pixel * pixel value
                     Store the sum (already a weighted mean, because the weights are normalized) in the corresponding pixel in the output image.
       For each device
              Transfer (N/#devices) rows of output image from device to host
       Host writes an NxN output image to file
       Host cleans up.
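The per-pixel loop of the pseudo code can be written as a serial C reference, which is handy for checking the output of an OpenCL kernel. The sketch below is an assumption about how such a reference might look, not the kernel shipped with the sample: `smooth_rows` and `WEIGHTS` are hypothetical names, the input is assumed to be already padded by one pixel on every side, and the weights are kept as the integers 1, 2 and 4 with a single rounded division by 16 at the end, which is equivalent to using fractional weights that add up to 1.

```c
#include <assert.h>
#include <stddef.h>

/* Integer form of the weight matrix: every entry is a multiple of 1/16. */
static const unsigned WEIGHTS[3][3] = {
    {1, 2, 1},
    {2, 4, 2},
    {1, 2, 1},
};

/* Smooth a width x height image. `padded` is the (width+2) x (height+2)
   input whose one-pixel border duplicates the edge pixels; `out` receives
   the width x height smoothed pixels. */
void smooth_rows(const unsigned char *padded, size_t width, size_t height,
                 unsigned char *out)
{
    assert(padded && out && width > 0 && height > 0);
    size_t pw = width + 2;
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            unsigned sum = 0;
            /* Original pixel and its 8 neighbors. */
            for (size_t ky = 0; ky < 3; ++ky)
                for (size_t kx = 0; kx < 3; ++kx)
                    sum += WEIGHTS[ky][kx] * padded[(y + ky) * pw + (x + kx)];
            /* Divide by the total weight (16), rounding to nearest. */
            out[y * width + x] = (unsigned char)((sum + 8) / 16);
        }
    }
}
```

Because the weights add up to 16 before the division, a region of constant gray value passes through unchanged, while an isolated bright pixel is spread over its neighborhood.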

Before compiling the program, called ocl_sample.cpp, you need to establish the proper environment settings for the Intel C++ Compiler for the coprocessor.

 # source /opt/intel/composerxe/bin/compilervars.sh intel64

Build the application ocl_sample.out for the coprocessor.

 # icc ocl_sample.cpp -lOpenCL -oocl_sample.out

To run the application on the coprocessor, simply execute the binary the way you would execute any binary on a Linux system.

 # ./ocl_sample.out 
Platform: Intel(R) OpenCL
Number of accelerators found: 2
 
DEVICE #0:
NAME:Intel(R) Many Integrated Core Acceleration Card
#COMPUTE UNITS:240
 
DEVICE #1:
NAME:Intel(R) Many Integrated Core Acceleration Card
#COMPUTE UNITS:240
 
Compilation started
Compilation done
Linking started
Linking done
Build started
Kernel <smoothing_kernel> was successfully vectorized
Done.
OpenCL Initialization Completed
Completed reading Input Image: 'input.pgm'
Transferring Data from Host to Device
Executing Kernel on selected devices
Transferring data from Device to Host
Completed writing Output Image: 'output.pgm'
Completed execution! Cleaning Up.

This binary expects a grayscale input image named 'input.pgm'. On successful execution, an output file named 'output.pgm' is created. As evident from the file extension, the input image file must be in the Portable GrayMap (PGM) format.
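For readers who need to produce or inspect such files, a PGM image can be written and read with a few lines of C. The sketch below is only an assumption about the file layout: it handles the binary ("P5") variant with a maximum gray value of 255 and no comment lines in the header, and it does not claim to match the I/O code of the sample program. The names `write_pgm` and `read_pgm` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write an 8-bit grayscale image as a binary (P5) PGM file.
   Returns 0 on success, -1 on failure. */
int write_pgm(const char *path, const unsigned char *pixels,
              unsigned width, unsigned height)
{
    assert(path && pixels && width > 0 && height > 0);
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t count = (size_t)width * height;
    int ok = fprintf(f, "P5\n%u %u\n255\n", width, height) > 0 &&
             fwrite(pixels, 1, count, f) == count;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read a binary (P5) PGM with maxval 255 and no comment lines.
   Returns a malloc'ed pixel buffer (caller frees) or NULL on failure. */
unsigned char *read_pgm(const char *path, unsigned *width, unsigned *height)
{
    assert(path && width && height);
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    unsigned maxval = 0;
    unsigned char *pixels = NULL;
    /* Header: magic, width, height, maxval, then one whitespace byte. */
    if (fscanf(f, "P5 %u %u %u", width, height, &maxval) == 3 &&
        maxval == 255 && *width > 0 && *height > 0 && fgetc(f) != EOF) {
        size_t count = (size_t)*width * *height;
        pixels = malloc(count);
        if (pixels && fread(pixels, 1, count, f) != count) {
            free(pixels);
            pixels = NULL;
        }
    }
    fclose(f);
    return pixels;
}
```

A small test image written with `write_pgm` can serve as 'input.pgm' if the sample accepts the binary variant; the plain-text ("P2") variant would need a different reader.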

The input and output images for a sample run are shown below. The edges in the output image have been blurred; however, the regions of the image with constant gray scale values remain unchanged. Thus, as expected, Gaussian Kernel Smoothing blurs edges in the input image.


Figure 1: Input Image (left) and corresponding Output Image (right)

Chapter 4 – Tools and Resources

Several tools are available to aid the developer in building OpenCL* applications running on the Intel Xeon Phi coprocessor. For example, the Kernel Builder provided with the Intel SDK for OpenCL* enables you to build and analyze OpenCL* kernels. The tool provides full offline OpenCL* compilation and includes the OpenCL* syntax checker, cross-hardware platform compiler, Low Level Virtual Machine (LLVM) viewer, assembly code viewer, and intermediate program binary generator. To find out more about the Kernel Builder, see this article: http://software.intel.com/sites/landingpage/opencl/user-guide-2013/index.htm#Using_the_Intel_SDK_for_OpenCL_Offline_Compiler_Standalone_Tool.htm

Also, Intel® VTune™ Amplifier XE can be used to analyze OpenCL* applications running on the Intel Xeon Phi coprocessor. To find out more about profiling OpenCL* applications running on the Intel Xeon Phi coprocessor, see this article: http://software.intel.com/en-us/articles/performance-tuning-of-opencl-applications-on-intel-xeon-phi-coprocessor-using-intel-vtune-amplifier-xe-2013

There is a wealth of information available to the developer in the form of guides, articles and blogs. The following is a small subset of documents that might be particularly helpful for developing OpenCL* codes running on the Intel Xeon Phi coprocessor.

Intel SDK for OpenCL* Applications XE 2013 – User’s Guide: http://software.intel.com/sites/products/documentation/ioclsdk/2013XE/UG/index.htm
OpenCL* Design and Programming Guide for the Intel Xeon Phi coprocessor: http://software.intel.com/en-us/articles/opencl-design-and-programming-guide-for-the-intel-xeon-phi-coprocessor
Tutorial: Optimizing OpenCL applications for Intel® Xeon Phi™ Coprocessor: http://software.intel.com/en-us/articles/workshop-optimizing-opencl-applications-for-intel-xeon-phi-coprocessor
Optimization Guide: http://software.intel.com/sites/products/documentation/ioclsdk/2013XE/OG/index.htm

About the Authors

Sumedh Naik received a Bachelor’s degree in Electronics Engineering from Mumbai University, India in 2009 and a Master’s degree in Computer Engineering from Clemson University in December 2012. He joined Intel in 2012 and has been working as a Software Engineer, focusing on developing collateral for the Intel® Xeon Phi™ coprocessor.

Loc Q Nguyen received an MBA from University of Dallas, a master’s degree in Electrical Engineering from McGill University, and a bachelor's degree in Electrical Engineering from École Polytechnique de Montréal. He is currently a software engineer with Intel Corporation's Software and Services Group. His areas of interest include computer networking, computer graphics, and parallel processing.

Notices

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

This sample source code is released under the Intel Sample Source Code Agreement located at http://software.intel.com/en-us/articles/intel-sample-source-code-license-agreement/

Intel, the Intel logo, Cilk, Xeon and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and other countries.

*Other names and brands may be claimed as the property of others

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.

Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804
