DolphinScheduler 3.2.0 Installation Guide and Troubleshooting Deep Dive

Written by zhoujieguang | Published 2025/06/06

TL;DR: From setup to troubleshooting to performance and security evaluation, everything you need in one place.

Author’s Note: Although this test confirms that the 3.2.0 single-node deployment is fully functional, the community still recommends version 3.1.9 for production use due to its higher stability.

General Overview

This guide covers environment setup and deployment strategy, step-by-step installation, and fixes for common issues, and concludes with a brief test report. The structure is illustrated in the diagram below:

I. Deployment Environment

  • Java Version: 1.8.0_181
  • Operating System: CentOS Linux release 7.6.1810
  • MySQL Version: 5.7.22-22-log
  • MySQL Driver Version: 8.0.16

II. Version Info

  • DolphinScheduler Version: 3.2.0
  • Note: This version supports more Chinese domestic databases (e.g., Dameng), catering to localization needs.
  • Download: DolphinScheduler 3.2.0 Download

III. Deployment Plan

  • Use Case: Platform used for data quality monitoring and alerting; no high-concurrency demand
  • Deployment Type: Single-node
  • Data Storage: External MySQL for persistence

IV. Deployment Steps

4.1 Upload and Extract Deployment Package

tar -xvzf apache-dolphinscheduler-3.2.0-bin.tar.gz

4.2 Create External Database

Create a new database instance and user, granting full privileges to the user.

Note: Replace {user} and {password} with your actual MySQL credentials.

mysql -uroot -p
mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';
mysql> FLUSH PRIVILEGES;
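
To avoid typing the statements by hand, the same SQL can be generated from two variables. The sketch below is only an illustration: the function name render_ds_bootstrap_sql and its arguments are hypothetical, and it assumes the password contains no single quotes.

```shell
# Sketch: print the bootstrap SQL from section 4.2 for a given user and password.
# render_ds_bootstrap_sql is a hypothetical helper, not part of DolphinScheduler.
# Assumption: neither argument contains a single quote.
render_ds_bootstrap_sql() {
  local user="$1"
  local password="$2"
  cat <<SQL
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '${user}'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '${user}'@'localhost' IDENTIFIED BY '${password}';
FLUSH PRIVILEGES;
SQL
}
```

The output can be reviewed first and then piped into the client, for example: render_ds_bootstrap_sql ds_user 'secret' | mysql -uroot -p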

4.3 Modify Metadata Database Configuration

Configure the metadata database to use MySQL for persistence. Modify the following file:

Path: ./apache-dolphinscheduler-3.2.0-bin/bin/env/dolphinscheduler_env.sh

Note: Update the IP address, port, database username and password.

Important: Do not change ${DATABASE}.

export DATABASE=mysql
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:mysql://xxx.xxx.xxx.xxx:23306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowPublicKeyRetrieval=true"
export SPRING_DATASOURCE_USERNAME=dolphinscheduler
export SPRING_DATASOURCE_PASSWORD=xxxxxxxxx
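
After editing the file, a quick check that all five variables are exported can catch a missed line before the schema is initialized. This is a minimal sketch; check_ds_env is a hypothetical helper name, and it only confirms that each export line exists, not that the values are correct.

```shell
# Sketch: confirm that an env file exports every variable set in section 4.3.
# check_ds_env is a hypothetical helper; it checks presence only, not values.
check_ds_env() {
  local env_file="$1"
  local missing=0
  local name
  for name in DATABASE SPRING_PROFILES_ACTIVE SPRING_DATASOURCE_URL \
              SPRING_DATASOURCE_USERNAME SPRING_DATASOURCE_PASSWORD; do
    if ! grep -Eq "^[[:space:]]*export[[:space:]]+${name}=" "$env_file"; then
      echo "missing: ${name}"
      missing=1
    fi
  done
  return "$missing"
}
```

Example: check_ds_env ./apache-dolphinscheduler-3.2.0-bin/bin/env/dolphinscheduler_env.sh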

4.4 Upload MySQL Driver

Upload the driver to the following directories:

../apache-dolphinscheduler-3.2.0-bin/standalone-server/libs/standalone-server  
../apache-dolphinscheduler-3.2.0-bin/api-server/libs  
../apache-dolphinscheduler-3.2.0-bin/alert-server/libs  
../apache-dolphinscheduler-3.2.0-bin/master-server/libs  
../apache-dolphinscheduler-3.2.0-bin/tools/libs  
../apache-dolphinscheduler-3.2.0-bin/worker-server/libs  
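
Copying the driver into six directories by hand is easy to get wrong, so a small loop may help. This is a sketch under assumptions: install_mysql_driver is a hypothetical helper, and the JAR file name in the usage line (mysql-connector-java-8.0.16.jar) is assumed from the driver version listed in section I.

```shell
# Sketch: copy one driver JAR into every libs directory listed in section 4.4.
# install_mysql_driver is a hypothetical helper, not part of DolphinScheduler.
install_mysql_driver() {
  local jar="$1"
  local ds_home="$2"
  local dir
  for dir in standalone-server/libs/standalone-server api-server/libs \
             alert-server/libs master-server/libs tools/libs worker-server/libs; do
    # Stop at the first failure so a missing directory is noticed immediately.
    cp "$jar" "$ds_home/$dir/" || return 1
    echo "copied to $ds_home/$dir"
  done
}
```

Example: install_mysql_driver ./mysql-connector-java-8.0.16.jar ../apache-dolphinscheduler-3.2.0-bin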

4.5 Initialize External Database

sh ../apache-dolphinscheduler-3.2.0-bin/tools/bin/upgrade-schema.sh

4.6 Start and Stop Services

Start single-node:

sh ../apache-dolphinscheduler-3.2.0-bin/bin/dolphinscheduler-daemon.sh start standalone-server

Check status:

sh ../apache-dolphinscheduler-3.2.0-bin/bin/dolphinscheduler-daemon.sh status standalone-server

Stop single-node:

sh ../apache-dolphinscheduler-3.2.0-bin/bin/dolphinscheduler-daemon.sh stop standalone-server
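
The service can take a while to come up after the start command returns. A generic retry loop, sketched below, can wait for it in deployment scripts. wait_until is a hypothetical helper; which command to poll is an assumption, for example an HTTP request against the Web UI port.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_until is a hypothetical helper: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  local attempts="$1"
  local delay="$2"
  shift 2
  while [ "$attempts" -gt 0 ]; do
    # Output of the probed command is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}
```

Example (assuming curl is installed and the UI answers on port 12345): wait_until 30 2 curl -sf http://xxx.xx.xx.xxx:12345/dolphinscheduler/ui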

4.7 Access Web UI

Access URL: http://xxx.xx.xx.xxx:12345/dolphinscheduler

Default credentials: admin/dolphinscheduler123


V. Common Issues & Fixes

5.1 Time Mismatch

  • Issue: Task creation time on UI differs from actual system time.
  • Fix: After logging in, switch the timezone to "Asia/Shanghai" in the top-right menu. Otherwise, task creation/update times will appear incorrect.

5.2 Abnormal Termination

  • Issue: If standalone-server terminates abnormally (e.g., high load, manual kill), restarting may leave many tasks in "Running" state that cannot be deleted.
  • Fix 1: Take the workflow definition offline first, then restart the standalone-server. The issue occurs because scheduling plans submitted before the crash are executed again after the restart, resulting in a deadlock. This is a temporary workaround.
  • Fix 2: If the first fix doesn’t work, try clearing data in the t_ds_task_instance (task instance records) and t_ds_process_instance (workflow instance records) tables, or delete specific rows.

5.3 Excessive Logs

  • Issue: Large log files are generated during runtime.
  • Fix: Logs for standalone-server are stored under the standalone-server/logs directory. Set up automated scripts to clean them periodically.
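
A periodic cleanup can be as small as one find call. The sketch below is an assumption about what such a script might look like: clean_old_logs is a hypothetical helper, the *.log pattern and the retention period are placeholders to adapt, and -delete requires GNU or BSD find.

```shell
# Sketch: delete *.log files last modified more than <days> days ago.
# clean_old_logs is a hypothetical helper; the *.log pattern is an assumption.
clean_old_logs() {
  local log_dir="$1"
  local days="$2"
  # -print lists each file as it is removed, which is useful in cron mail or logs.
  find "$log_dir" -type f -name '*.log' -mtime +"$days" -print -delete
}
```

Example: clean_old_logs ../apache-dolphinscheduler-3.2.0-bin/standalone-server/logs 7, scheduled once a day from cron.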

5.4 Hive Connection Failure in Data Source Center

  • Issue: Error message during registration:

    Required field 'client_protocol' is unset! Struct: TOpenSessionReq(client_pro
    
  • Analysis: Caused by version mismatch between DolphinScheduler’s Hive JDBC and your HDP Hive.

  • Fix: Replace JDBC driver with the one matching your Hive cluster version:

  1. Back up the existing JDBC drivers:
mv ../apache-dolphinscheduler-3.2.0-bin/api-server/libs/hive-jdbc-2.3.9.jar ../apache-dolphinscheduler-3.2.0-bin/api-server/libs/hive-jdbc-2.3.9.jar.bak
mv ../apache-dolphinscheduler-3.2.0-bin/worker-server/libs/hive-jdbc-2.3.9.jar ../apache-dolphinscheduler-3.2.0-bin/worker-server/libs/hive-jdbc-2.3.9.jar.bak
  2. Upload the correct Hive JDBC jar files to the above paths.
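
The backup step repeats the same mv for each module, so it can be wrapped in a loop. This is a sketch only: backup_jar is a hypothetical helper that renames the named JAR to <name>.bak under each given module's libs directory and skips modules where the file is absent.

```shell
# Sketch: rename a JAR to <name>.bak in each given module's libs directory.
# backup_jar is a hypothetical helper: backup_jar <ds_home> <jar_name> <module...>
backup_jar() {
  local ds_home="$1"
  local jar_name="$2"
  local module
  local jar
  shift 2
  for module in "$@"; do
    jar="$ds_home/$module/libs/$jar_name"
    if [ -f "$jar" ]; then
      mv "$jar" "$jar.bak" && echo "backed up $jar"
    else
      echo "skipped $module: $jar_name not found"
    fi
  done
}
```

Example: backup_jar ../apache-dolphinscheduler-3.2.0-bin hive-jdbc-2.3.9.jar api-server worker-server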

5.5 SQL Script Using Hive UDFs

  • Issue: Error when using SQL script task with registered Hive UDF:

    [ERROR] 2024-02-23 15:36:07.355 +0800 - execute sql error: Error while compiling statement: FAILED: ParseException line 1:18 missing KW_VIEW at 'temporary' near 'replace' in table name
    
  • Analysis: Hive does not support CREATE OR REPLACE TEMPORARY FUNCTION.

  • Fix: Modify the SQL grammar in the source code and replace the JAR files:

  1. Back up the existing task JARs:
mv ../apache-dolphinscheduler-3.2.0-bin/api-server/libs/dolphinscheduler-task-sql-3.2.0.jar ../apache-dolphinscheduler-3.2.0-bin/api-server/libs/dolphinscheduler-task-sql-3.2.0.jar.bak
mv ../apache-dolphinscheduler-3.2.0-bin/worker-server/libs/dolphinscheduler-task-sql-3.2.0.jar ../apache-dolphinscheduler-3.2.0-bin/worker-server/libs/dolphinscheduler-task-sql-3.2.0.jar.bak
mv ../apache-dolphinscheduler-3.2.0-bin/master-server/libs/dolphinscheduler-task-sql-3.2.0.jar ../apache-dolphinscheduler-3.2.0-bin/master-server/libs/dolphinscheduler-task-sql-3.2.0.jar.bak
  2. Upload the updated JARs with corrected SQL syntax.

VI. Test Report

6.1 Test Summary

Current Release: 3.2.0

Functional, compatibility, and security tests have been completed and passed. Ready for release.

| No. | Functionality Description | Result | Notes |
| --- | --- | --- | --- |
| 1 | Workflow Definition - Data Quality Config | ✅ Passed | |
| 2 | Workflow Instance | ✅ Passed | |
| 3 | Workflow Scheduling | ✅ Passed | |
| 4 | Task Definition | ✅ Passed | |
| 5 | Task Instance | ✅ Passed | |
| 6 | UDF Management | ✅ Passed | |
| 7 | Task Group Management | ✅ Passed | |
| 8 | Data Quality Results | ✅ Passed | |
| 9 | Quality Rule Management | ✅ Passed | |
| 10 | Data Source Center | ✅ Passed | |
| 11 | Alert Instance Management | ✅ Passed | |
| 12 | Alert Group Management | ✅ Passed | |

6.2 Compatibility & Performance Evaluation

| Environment | Result |
| --- | --- |
| Java: 1.8.0_181, OS: CentOS Linux 7.6.1810, MySQL: 5.7.22, Driver: 8.0.16 | ✅ Passed |

6.3 Performance Metrics

| Scenario | Name | Metric Description | Actual Result | Verdict |
| --- | --- | --- | --- | --- |
| 1 | Throughput | Tasks processed per unit time | 100 tasks run in parallel, finished in 5 mins | ✅ Passed |
| 2 | Response Time | Time from submission to execution | Data quality check completed within 1 min | ✅ Passed |
| 3 | Concurrent Users | Simultaneous active users | 10 concurrent users supported | ✅ Passed |
| 4 | CPU Usage | CPU utilization during runtime | <5% when idle | ✅ Passed |
| 5 | Memory Usage | Memory usage during runtime | <5% when idle | ✅ Passed |

6.4 Security Metrics & Evaluation

6.4.1 Test Indicators

  • Vulnerability Name: Spring Boot Actuator Unauthorized Access
  • Risk Level: High
  • Vulnerability Description: Actuator is a set of monitoring and management functions that Spring Boot provides for applications. It exposes detailed information about the application configuration, such as auto-configuration details, the Spring beans that were created, system environment variables, and details of Web requests. If it is misconfigured or left exposed by oversight, it can cause serious security risks such as information leakage. When the env or jolokia endpoint is open, it may lead to remote code execution under specific configurations.
  • Vulnerability Link: http://172.30.10.153:12345/dolphinscheduler/ui
  • Vulnerability Parameters: ["GET", "http://172.30.10.153:12345/dolphinscheduler/ui", "", "http://172.30.10.153:12345/dolphinscheduler/actuator/metrics", "", "", ""]
  • Judgment Details: http://172.30.10.153:12345/dolphinscheduler/ui
  • Solution: Introduce the spring-boot-starter-security dependency and add security authentication:

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-security</artifactId>
    </dependency>

    management.security.enabled=true
    security.user.name=username
    security.user.password=password

6.4.2 Fix Instructions

  1. Disable Actuator: modify the config files in apache-dolphinscheduler-3.2.0-bin/standalone-server/conf
  2. Disable Swagger: add the config to apache-dolphinscheduler-3.2.0-bin/standalone-server/conf
  • Verify Actuator is disabled: Accessing http://${ip}:12345/dolphinscheduler/actuator/metrics should return nothing.
  • Verify Swagger is disabled: Accessing http://${ip}:12345/dolphinscheduler/swagger-ui/index.html should return nothing.
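
The two verification requests can be scripted so they are easy to repeat after each config change. The sketch below rests on assumptions: the helper names are hypothetical, curl must be installed, and only an HTTP 200 response is treated as "still exposed", since the exact status a disabled endpoint returns is not stated here.

```shell
# Sketch: report whether an endpoint still answers with HTTP 200.
# http_status, is_exposed and check_endpoint are hypothetical helper names.

# Print the HTTP status code of a URL; curl prints 000 if the request fails outright.
http_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1" || true
}

# Assumption: only a 200 response counts as "still exposed".
is_exposed() {
  [ "$1" = "200" ]
}

# Print a one-line verdict and return non-zero if the endpoint is exposed.
check_endpoint() {
  local url="$1"
  local code
  code="$(http_status "$url")"
  if is_exposed "$code"; then
    echo "EXPOSED ($code): $url"
    return 1
  fi
  echo "closed ($code): $url"
}
```

Example: check_endpoint "http://${ip}:12345/dolphinscheduler/actuator/metrics" and check_endpoint "http://${ip}:12345/dolphinscheduler/swagger-ui/index.html", with ip set to the server address.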

6.4.3 Test Conclusion

After fixing vulnerabilities, the system is safe for release.


Written by zhoujieguang | Apache DolphinScheduler Committer
Published by HackerNoon on 2025/06/06