Machine Learning on OpenShift and Kubernetes

Red Hat’s customers are increasingly investing in and adopting artificial intelligence (AI) and machine learning (ML) to better serve their customers, create value, grow their business, and reduce cost and complexity.

“Our customers see great potential in using Machine Learning (ML) to solve their business challenges. Technical advances in hardware acceleration and innovation in open source frameworks make ML a viable tool,” said Chris Wright, Chief Technology Officer, Red Hat.

We are listening to our customers. We are teaming up with Google and other members of the Kubernetes community with a goal of creating a strong open source community for AI and ML on Kubernetes and OpenShift — Red Hat’s enterprise Kubernetes platform.

Our goal is to create a future where developers and data scientists can easily access and consume AI and ML technologies and capabilities in support of their business and organizational goals.

AI and ML have the potential to create new opportunities for consumers and enterprises alike. However, to realize these opportunities, developers and operators will need to navigate a vast and complex space that builds upon fast-moving industry changes such as cloud, big data, high-performance computing (HPC), specialized hardware accelerators (such as GPUs and FPGAs), and specialized ML frameworks.

The typical ML workflow involves defining business objectives, collecting private and public data, refining and storing the data, and iteratively creating and validating useful models that must be put into production. Since nothing occurs in a vacuum, these ML methods are implemented as components of intelligent applications, which serve and apply predictive models to real-world, high-velocity information in order to provide key functionality. All this must be done in a repeatable, scalable, and resilient manner. Furthermore, some of this can require specialized (and often expensive) hardware resources, which increases the importance of resource management and utilization.

While data scientists have access to data and hardware for training and serving models today, the entire process can be bespoke, complex, inflexible, incomplete, and not scalable. As a result, the impact of advances in ML has largely been siloed, muted, and limited. We see the need for ways to break through this barrier and accelerate the application of ML.

Enter containers and Kubernetes. Developers and operators are increasingly embracing Linux containers and Kubernetes to help accelerate application development and deployment. Containers and Kubernetes abstract and simplify access to underlying infrastructure and provide robust capabilities to manage application lifecycle and development workflows. OpenShift further enhances this experience with additional capabilities for self-service and for build and deployment automation. Additional features in security, storage, networking, monitoring, and observability make OpenShift well suited for enterprise environments.

OpenShift is therefore well positioned to manage the complexity of ML and to democratize access to these techniques. In the words of Chris Wright,

“Using OpenShift, Red Hat’s enterprise Kubernetes distribution, data scientists can easily deploy any containerized ML stack at scale on public or private infrastructure for training and querying models.”

Red Hat is already contributing in this space.

Red Hat OpenShift engineers are founding members of the Kubernetes Resource Management Working Group. This group is contributing towards:

  • Integration and access to specialized hardware resources such as GPUs, FPGAs, and InfiniBand;
  • Specialized features such as exclusive cores, CPU pinning strategies, hugepages, and NUMA;
  • Optimizing the access and efficiency of resources with robust scheduling, prioritization, and preemption capabilities; and
  • Performance benchmarking and tuning.
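
As a rough illustration of what these resource-management features mean for a workload, the sketch below builds a Kubernetes Pod manifest that requests a GPU, hugepages, and whole CPU cores. It is a minimal sketch under assumptions, not a recommended configuration: the pod name, image, and sizes are hypothetical, the GPU resource name (`nvidia.com/gpu` here) depends on which device plugin a cluster runs, and hugepages and exclusive cores are only honored on clusters and nodes configured for them.

```python
import json


def build_training_pod(name, image, gpus=1, hugepages_2mi="1Gi", cpus=4, memory="8Gi"):
    """Return a Pod manifest (a plain dict) that requests specialized resources.

    All names and sizes are illustrative assumptions, not values from the article.
    """
    resources = {
        # A whole number of CPUs, with requests equal to limits, is what lets
        # a node configured for exclusive cores pin this container to them.
        "cpu": str(cpus),
        "memory": memory,
        # Pre-allocated 2 MiB hugepages; the node must already provide them.
        "hugepages-2Mi": hugepages_2mi,
    }
    if gpus > 0:
        # Assumed resource name; it is advertised by the cluster's GPU device plugin.
        resources["nvidia.com/gpu"] = str(gpus)

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    # Identical requests and limits: required for hugepages and
                    # extended resources such as GPUs.
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ],
        },
    }


# Hypothetical pod name and image, used only for illustration.
manifest = build_training_pod("ml-training-example", "example.com/ml/trainer:latest")
print(json.dumps(manifest, indent=2))
```

The printed JSON is an ordinary Pod definition. Keeping requests and limits identical is a deliberate choice in this sketch, since it is the form in which Kubernetes accepts hugepages and extended resources.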

Red Hat is also working with hardware vendors and the community to help simplify the access and management lifecycle of hardware drivers and libraries for the benefit of our customers.

Last year, Red Hat launched radanalytics.io, an open source project aimed at making it easier to build intelligent applications on OpenShift. With twin goals of supporting the data science workflow and making machine learning capabilities accessible to enterprise developers, the radanalytics.io community brings technologies like Apache Spark, Project Jupyter, TensorFlow, Apache Kafka, AMQP, Ceph, S3, and OpenShift S2I together on OpenShift. Red Hat engineers contribute to many of these projects. In addition, a Red Hat engineer co-leads the Kubernetes Big Data Special Interest Group (SIG) and has been leading the effort to introduce first-class support for scheduling on Kubernetes into the upstream Apache Spark project.

In response to strong interest from the Red Hat ecosystem, Red Hat, along with Google and other members of the OpenShift Commons, has launched the OpenShift Commons Machine Learning SIG to further the conversations with the community around best practices for ML workloads on OpenShift. Red Hat engineers and others in the community have been gathering to discuss, develop, and disseminate best practices for deploying ML applications and workloads on OpenShift.

Today we are taking another step forward and working with the community on the Kubeflow project. Our intent is to make Kubeflow a vendor-neutral, open community with the mission to make machine learning on Kubernetes easier, more portable, and more scalable.

“We’re ecstatic that Red Hat has joined the Kubeflow community and is bringing their knowledge of large-scale deployments to the project,” said David Aronchick, Product Manager on Kubeflow. “With OpenShift’s native Kubernetes implementation and success in major companies around the world, Kubeflow gives you the opportunity to bring ML and OpenShift together on a single platform.”

“We’re excited to work with the nascent Kubeflow community leveraging open source ML tools and Kubernetes to make ML workflows ubiquitous,” added Chris Wright.

In conclusion, Red Hat is working with Google and other Kubernetes community members to further hone the mission and objectives of Kubeflow and to put into motion a plan for building a truly open and collaborative community. We look forward to lending our engineering and community-building talents alike to this new project.

We encourage interested members of the larger Red Hat ecosystem and community to join us in this.

Get Started with Machine Learning:

Related GitHub projects:

Related OpenShift Commons Briefings:

OpenShift Commons Machine Learning SIG:
