Big Data – a buzzword you can find everywhere these days, from nerdy blogs to scientific research papers and even in the news. But how does Big Data analysis work, exactly? To find that out, I attended the workshop “Deep Learning with Keras and TensorFlow”.
On a stormy Thursday afternoon, we arrived at the modern and light-flooded codecentric AG headquarters. There, we met performance expert Dieter Dirkes and Data Scientist Dr. Shirin Glander. Over the following two days, Shirin gave us a hands-on introduction to the secrets of Deep Learning and helped us program our first Neural Net. After a short round of introductions, it became clear that many different areas and domains are interested in Deep Learning: geologists want to classify (satellite) images, energy providers want to analyse time series, insurers want to predict numbers and I – a humanities major – want to classify text. codecentric employees were also interested in getting to know the possibilities of Deep Learning, so a third of the participants came from the company itself.
The second day began with a tasty Latte Macchiato and a nice “Good Morning” in the big kitchen of the codecentric headquarters. The friendly room, with real moss on the only wall that’s not glass, emanated a very creative air and made the start of the day much easier. For this day, Shirin had prepared a Neural Net that we could use to apply and try out what we had learned. We spent a lot of time exploring the effects that changing different parameters had on the performance of the Neural Net. During the practical part, many questions came up, which Shirin answered and discussed with us. We also talked about our own projects and ways to advance them with what we learned in the workshop.
Deep Learning with Artificial Neural Nets
Artificial Neural Nets (ANNs) can be used to analyse highly complex data. Deep Learning is a subdomain of machine learning, which is in turn a subdomain of Artificial Intelligence. ANNs are modelled after the human brain, which has evolved to learn efficiently and to transfer knowledge to new scenarios. Researchers tried to mimic the brain’s ability to learn by building ANNs, which are able to learn from training data and transfer this knowledge to new data.
As early as 1958, Frank Rosenblatt [1] presented his concept of ANNs, the so-called perceptron. The perceptron works with numeric inputs and weighted connections in a node or neuron; together with an activation function, it calculates the result.
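A minimal sketch of this idea in Python (the input values, weights and bias below are made-up illustrative numbers, not taken from Rosenblatt’s paper or the workshop):

```python
import numpy as np

def perceptron(x, w, b):
    """A single artificial neuron: weighted sum of the numeric inputs
    plus a bias, passed through a step activation function."""
    z = np.dot(w, x) + b        # weighted connections
    return 1 if z > 0 else 0    # step activation: the neuron "fires" or not

# Made-up example values
x = np.array([0.5, -1.0, 2.0])   # numeric input
w = np.array([0.4, 0.3, 0.9])    # weights of the connections
b = -1.0                         # bias
print(perceptron(x, w, b))       # -> 1, because 0.2 - 0.3 + 1.8 - 1.0 = 0.7 > 0
```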
With increasing computational power and increasing amounts of data, multi-layer perceptrons and deep neural nets became more and more prevalent. Alongside the boom in machine learning, more and more deep learning frameworks, libraries and programming interfaces were developed. Today, there is a plethora of easy-to-use and open-source materials for deep learning. The bar for getting your feet wet with this complex topic has never been lower; you hardly need any deep mathematical knowledge any more in order to build neural nets.
TensorFlow and Keras
TensorFlow and Keras are two such deep learning APIs. TensorFlow was developed by Google Brain and is written in Python and C++. It is currently one of the most widely used open-source libraries for machine learning. Keras builds on top of it and works specifically with neural nets. Like TensorFlow, it is open source and written in Python. Several other libraries can also be used as a backend for Keras, e.g. Theano and CNTK. TensorFlow is based on the concept of data flow graphs: every neural net can be represented as a graph with mathematical operations as nodes and multi-dimensional data objects, the so-called tensors, as edges. With TensorFlow we can also use the integrated TensorBoard, a collection of visualisation tools that makes it easy to analyse training runs, find potential errors and optimize neural nets.
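A tiny sketch of the data-flow idea, assuming a recent TensorFlow version with eager execution (older versions required building the graph and running it in a session explicitly): the tensors are the data flowing along the edges, and operations like `tf.matmul` are the nodes.

```python
import tensorflow as tf

# Two tensors: multi-dimensional data objects flowing along the graph's edges
a = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])
b = tf.constant([[1.0],
                 [0.5]])

# A mathematical operation: a node in the data flow graph
c = tf.matmul(a, b)
print(c)   # tf.Tensor([[2.], [5.]], shape=(2, 1), dtype=float32)
```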
Image classification with Neural Nets
Most of the research into neural nets has so far involved image classification. The most famous dataset for this task is MNIST [2], which contains 28 x 28 pixel images of hand-written digits and their respective labels. The MNIST dataset is ideal for getting started with image classification and neural nets. Classification with neural nets falls into the category of supervised learning because it uses labelled examples to train models. The hand-written digits of the MNIST dataset can be used to learn a classification task, namely which pixel arrangement corresponds to which digit.
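With Keras, the dataset can be loaded in a single call; here is a minimal sketch (the flattening and scaling choices are common defaults, not necessarily the exact preprocessing used in the workshop):

```python
from tensorflow import keras

# 60,000 training and 10,000 test images of hand-written digits with labels 0-9
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Flatten the 28 x 28 pixel images into vectors and scale pixel values to [0, 1]
x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28 * 28).astype("float32") / 255.0

# One-hot encode the labels so they match a 10-class softmax output
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
```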
The neural net is fed input data from the training set, which it passes on to the next layer of neurons. The initial weights are picked at random, and every neuron in a layer calculates an output based on the incoming data and the weights. This output is then passed on to the following layers. In the final layer, a softmax function calculates the probabilities with which the input image is assigned to each of the classes. The prediction for this input is the class with the highest probability. Because we know the correct class for every image in our training set, we can calculate the difference between the prediction and the correct class; this difference is measured with the cross-entropy. A cross-entropy of 0 would mean that the neural net predicted the correct class with 100% certainty. After every cycle of training, we calculate an error rate, the so-called loss, which represents the cross-entropy over the entire training set.
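A sketch of such a net in Keras, assuming the flattened MNIST data from above; the layer sizes are arbitrary example values, not the exact architecture Shirin prepared for the workshop:

```python
from tensorflow import keras

model = keras.Sequential([
    # Hidden layer: every neuron computes an output from the incoming data and its weights
    keras.layers.Dense(128, activation="relu", input_shape=(28 * 28,)),
    # Output layer: softmax turns the outputs into probabilities for the 10 digit classes
    keras.layers.Dense(10, activation="softmax"),
])

# The cross-entropy between the predicted probabilities and the correct class is the loss
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```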
During training, we aim to minimize this loss function. To achieve this, the weights of the neurons need to be optimized, e.g. by using backpropagation with stochastic gradient descent. The direction in which to change the weights is given by the gradient; we want to “descend” along this gradient to find the smallest error. The step size with which we traverse the gradient landscape is defined by the learning rate. We can start with big steps to prevent getting stuck in local minima, but as training advances we want to reduce the learning rate so that we can – ideally – reach the global minimum. With a loss of 0, we would have achieved the perfect result.
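Continuing from the model and data defined above, a training setup along these lines might look as follows; the learning rate, momentum, schedule and epoch values are illustrative choices, and the `learning_rate` argument name applies to newer Keras versions (older ones use `lr`):

```python
from tensorflow import keras

# Stochastic gradient descent: the step size along the gradient is the learning rate
sgd = keras.optimizers.SGD(learning_rate=0.1, momentum=0.9)
model.compile(optimizer=sgd,
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Reduce the learning rate when the validation loss stops improving,
# so that training takes smaller steps as it approaches a minimum
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                              factor=0.5, patience=2)

history = model.fit(x_train, y_train,
                    epochs=10, batch_size=128,
                    validation_split=0.1,
                    callbacks=[reduce_lr])
```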
The final evaluation of our neural net’s performance is measured on independent test data. We want to avoid the net learning the training set too specifically and no longer being able to generalize. There are many different parameters that can be adjusted when training a neural net to find the optimal configuration for our classification task. We try out different combinations and observe the effect they have on the result.
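Continuing from the snippets above, this final check on data the net has never seen during training could look like this:

```python
# Evaluate loss and accuracy on the independent test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")
```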
I want to thank Shirin for a great workshop! You made the complex topic of deep learning and neural nets much more accessible to us.
Maria Hartmann
—
Workshop description: https://www.codecentric.de/schulung/deep-learning-mit-keras-und-tensorflow/
[1] Rosenblatt, Frank (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Web: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.335.3398&rep=rep1&type=pdf
[2] LeCun, Yann. The MNIST Database of Handwritten Digits. Web: http://yann.lecun.com/exdb/mnist/
—
Guest author Maria Hartmann studies Digital Humanities at the University of Trier. Through her love of linguistics she came to the classification of semi-structured texts with neural nets – which she is now working on for her Master’s thesis. TensorFlow and Keras help her in her studies.