Task-Based Programming in Windows
Setting up a reliable parallel execution context so that programs can be converted into a series of independent executable tasks.
Concurrent Programming with Chain Locking
Concurrent access to trees and lists requires carefully managed fine-grained locking. Here's a generic solution in C# that removes many of the typical problems.
Getting to 1 Teraflop on the Intel Phi Coprocessor
The key to truly high performance with the Phi coprocessor is to express sufficient parallelism and vector capability to fully utilize the device. Here is a timing framework that enables you to measure...
Numerical and Computational Optimization on the Intel Phi
How tuning functions for large data sets and profiling the results yields most of the benefits of the Phi's 60 cores without hand-wringing and late-night hacking.
Programming the Xeon Phi
A series of articles on getting the best performance out of the new Intel Xeon Phi coprocessor.
The Quiet Revolution in Programming
During the last two years, one of the longest eras in programming has quietly drawn to a close.
Exceeding Supercomputer Performance with Intel Phi
Using MPI on inexpensive clusters of Intel Xeon Phi coprocessors can produce results that exceed the performance of today's high-end supercomputers.
Scala for C# Developers: A Tutorial
If you work with C#, you have already mixed object-oriented code with some aspects of functional programming. Why not master Scala?
Scala for C# Developers: Useful Features
Scala's immutable values and mutable variables, classes and constructors, and its use of operators as method names.
Scala for C# Developers: The Magic
Implicit conversions, avoiding nulls, default and named parameters, and using mixins and traits to borrow functionality from other classes are all part of what makes Scala a magically powerful language.
Debugging Multithreaded Applications in Windows
End the frustration of tracking down thread-specific bugs with a few simple options in Visual Studio.
Building Web Apps with Lift: Lazy Loading, Push, and Parallel Execution
Lift code is easy to read and maintain. Lazy loading, parallel execution, simple push mechanics, and REST support are just a few of Lift's stand-out benefits.
Atomic Operations and Low-Wait Algorithms in CUDA
Used correctly, atomic operations can help implement a wide range of generic data structures and algorithms in the massively threaded GPU programming environment. However, incorrect usage can turn...
Continuous Delivery: The First Steps
Continuous delivery integrates many practices that in their totality might seem daunting. But starting with a few basic steps brings immediate benefits. Here's how.
CUDA: Unifying Host/Device Interactions with a Single C++ Macro
A general method to move data transparently between the host and the CUDA device.
Android on x86: Understanding Android Device Emulation
Building customizable emulators and setting up the appropriate environments for developing applications.
A Robust Histogram for Massive Parallelism
Preserving highly parallel performance when every thread is simultaneously trying to increment a single object
Developing Android Apps with Scala and Scaloid
Build a UI layout by writing type-safe Scala code and wire your logic into the layout
A Massively Parallel Stack for Data Allocation
A fast, constant-time memory allocator and parallel stack are essential for initiating kernel launches from the CUDA device.