Oil painting teaching design based on the mobile platform in higher art education

July 5, 2024

Design of extraction method

The ends of brushstrokes tend to form distinctive tracks that appear on the edges of the image. An image edge is a local area where the brightness visible to the naked eye changes and pixel values change abruptly²⁷. Edge detection methods usually rely on various operators. Image edges of paintings generally take three forms: mutant, slow, and linear²⁸. A mutant edge is one where the pixel value changes abruptly within a certain area. A slow edge is one where the gray value (GV) of pixel points stays stable for a while, changes suddenly at some point, and then becomes stable again. A linear edge is one where the GV changes rapidly and returns to its original value within a short distance. Based on this, the first hypothesis is proposed: compared with the color feature, the brush feature is more important in image classification.

Current edge detection methods can be divided into those based on first-order and those based on second-order differential operations²⁹. The Laplace Operator is a commonly used second-order operator, while the Canny Operator is built on the first-order gradient. The Canny algorithm removes noise, computes the gradient value, and then detects edges.

The Laplace Operator is defined as Eq. (1):

$$\Delta f={\nabla }^{2}f=\nabla \cdot \nabla f$$

(1)

$\nabla f$ refers to the gradient of $f$; $\nabla \cdot$ stands for the divergence operator, so the Laplace Operator is the divergence of the gradient.
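On a pixel grid, Eq. (1) is commonly approximated by the 4-neighbour finite difference. The sketch below is a minimal illustration of that approximation, not the implementation used in this study; the function name `laplacian` and the choice to leave border pixels at 0 are assumptions.

```python
def laplacian(img):
    """Apply the 4-neighbour discrete Laplacian to a 2-D list of gray values.

    Assumption: border pixels are left at 0, since the source does not
    say how image borders are handled.
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Sum of the four neighbours minus four times the centre pixel.
            out[r][c] = (img[r - 1][c] + img[r + 1][c]
                         + img[r][c - 1] + img[r][c + 1]
                         - 4 * img[r][c])
    return out


# A single bright pixel on a dark background.
spot = [[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 9, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]]
response = laplacian(spot)
```

The response is strongly negative at the bright pixel and positive at its four direct neighbours, which is why a second-order operator reacts to isolated points and thin lines as well as to edges.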

In recent years, mathematical morphology has been widely applied in machine vision research. It is an image analysis subject based on lattice theory and topology, serving as the foundational theory for mathematical morphology image processing. The fundamental operations encompass opening & closing operations, erosion and dilation, skeleton extraction, limiting erosion, hit-and-miss transformation, morphological gradient, Top-hat transformation, particle analysis, watershed transformation, etc. The scope of research in mathematical morphology spans the design of element composition, the exploration of a morphological algorithm, the creation of the improved filter, the research of dynamic things, the processing of rich information, the implementation of optical hardware, the research of nonlinear wavelet, the exploration of multi-resolution signal, and the optimization algorithm³⁰.

The current trend of morphological operation is to improve its common usage and enhance its practicability. Common operations in morphological operations involve dilation, erosion, closing, and opening³¹. The erosion operation can be written as Eq. (2):

$$A\ominus B=\left\{x,y\mid (B{)}_{xy}\subseteq A\right\}$$

(2)

A expresses the image set; B refers to the structuring element; $(B)_{xy}$ is B translated so that its origin lies at pixel (x, y); x and y are the pixel coordinates. Equation (2) indicates that structuring element B is used to erode A.

Dilation can “enlarge” the range of the target area, merging background points in contact with the area into the target, and expanding the target boundary outside. Its function is to fill some holes in the target region and eliminate small particle noise contained in the region³². The dilation operation is expressed in Eq. (3).

$$A\oplus B=\left\{x,y\mid (B{)}_{xy}\cap A\ne {\varnothing }\right\}$$

(3)

Equation (3) means that structuring element B is used to dilate A: the origin of B is translated to the image pixel (x, y). If the intersection of B and A at (x, y) is not empty (that is, A is 1 at one or more of the positions where B is 1), the output pixel at (x, y) is assigned a value of 1. Otherwise, it is assigned a value of 0.

The definitions of opening operation ($A\circ B$) and closing operation ($A\cdot B$) are as follows.

$$A\circ B=\left(A\ominus B\right)\oplus B$$

(4)

$$A\cdot B=\left(A\oplus B\right)\ominus B$$

(5)

The opening operation performs erosion first and then dilation; the closing operation performs dilation first and then erosion.
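Equations (2) to (5) can be sketched on a small binary image as follows. This is an illustrative sketch under stated assumptions rather than the study's code: images are lists of 0/1 values, the structuring element is a 3 × 3 square with its origin at the centre (`SQUARE_3X3`, a hypothetical name), and pixels outside the image count as background. Because this element is symmetric, the question of reflecting B in the dilation does not arise.

```python
SQUARE_3X3 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def _under_element(img, r, c, offsets):
    """Yield the pixels of img covered by B when its origin sits at (r, c)."""
    rows, cols = len(img), len(img[0])
    for dr, dc in offsets:
        rr, cc = r + dr, c + dc
        # Assumption: anything outside the image counts as background (0).
        yield img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0


def erode(img, offsets=SQUARE_3X3):
    """Eq. (2): keep (r, c) only if B placed there fits entirely inside A."""
    return [[int(all(_under_element(img, r, c, offsets)))
             for c in range(len(img[0]))] for r in range(len(img))]


def dilate(img, offsets=SQUARE_3X3):
    """Eq. (3): set (r, c) if B placed there touches any pixel of A."""
    return [[int(any(_under_element(img, r, c, offsets)))
             for c in range(len(img[0]))] for r in range(len(img))]


def opening(img, offsets=SQUARE_3X3):
    """Eq. (4): erosion followed by dilation."""
    return dilate(erode(img, offsets), offsets)


def closing(img, offsets=SQUARE_3X3):
    """Eq. (5): dilation followed by erosion."""
    return erode(dilate(img, offsets), offsets)


# A 3 x 3 blob plus one isolated noise pixel at (3, 5).
noisy = [[int(1 <= r <= 3 and 1 <= c <= 3) for c in range(7)] for r in range(7)]
noisy[3][5] = 1
opened = opening(noisy)

# A 5 x 5 blob with a one-pixel hole at (3, 3).
holed = [[int(1 <= r <= 5 and 1 <= c <= 5) for c in range(7)] for r in range(7)]
holed[3][3] = 0
closed = closing(holed)
```

In this example the opening removes the isolated pixel while keeping the blob, and the closing fills the hole without growing the blob.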

Compared with other operators, the Sobel operator has a larger filter size and better noise resistance. Thus, the edge detection method in this study uses the Sobel algorithm, which is applied to the gray image to recognize edge pixels³³. The Sobel operator is an edge detection algorithm based on the first-order gradient. In the process of brushstroke extraction, the convolution kernels are shown in Eqs. (6) and (7):

$${G}_{X}=\left[\begin{array}{ccc}-1& 0& +1\\ -2& 0& +2\\ -1& 0& +1\end{array}\right]$$

(6)

$${G}_{Y}=\left[\begin{array}{ccc}-1& -2& -1\\ 0& 0& 0\\ +1& +2& +1\end{array}\right]$$

(7)

${G}_{X}$ and ${G}_{Y}$ represent the convolution kernels; X and Y denote the directions of the coordinate axes.

The mathematical expressions of Sobel operators read:

$$\begin{aligned}{G}_{x} & ={f}_{x}(x,y)=f(x-1,y+1)+2f(x,y+1)+f(x+1,y+1) \\ & \quad - f(x-1,y-1)-2f(x,y-1)-f(x+1,y-1)\end{aligned}$$

(8)

$$\begin{aligned}{G}_{y} &={f}_{y}(x,y)=f(x+1,y-1)+2f(x+1,y)+f(x+1,y+1) \\ & \quad - f(x-1,y-1)-2f(x-1,y)-f(x-1,y+1)\end{aligned}$$

(9)

f(x, y) refers to the gray value of the image at pixel (x, y).

The matrix form of the Sobel operator is illustrated in Eq. (10) and Eq. (11):

$${G}_{x}={f}_{x}(x,y)=\left[\begin{array}{ccc}-1& 0& 1\\ -2& 0& 2\\ -1& 0& 1\end{array}\right]$$

(10)

$${G}_{y}={f}_{y}(x,y)=\left[\begin{array}{ccc}-1& -2& -1\\ 0& 0& 0\\ 1& 2& 1\end{array}\right]$$

(11)

The overall change in GV at each pixel of the oil painting image combines the changes in the horizontal and vertical directions. The calculation method is given in Eq. (12):

$$G=\sqrt{{G}_{x}^{2}+{G}_{y}^{2}}$$

(12)

In the process of brush feature extraction, the oil painting image is first converted into a gray image. The gray image is then convolved with the 3 × 3 Sobel filters to obtain the gradient image.
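The gray conversion and Eqs. (6) to (12) can be sketched in plain Python as below. This is a minimal sketch, not the study's implementation: the luma weights in `to_gray` are the common ITU-R BT.601 values and are an assumption (the source does not state the conversion formula), border pixels are left at 0, and the kernels are applied without flipping, which matches Eqs. (8) and (9) and does not change the magnitude in Eq. (12).

```python
import math

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # Eq. (6)
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # Eq. (7)


def to_gray(rgb):
    """Convert one (R, G, B) pixel to a gray value (assumed BT.601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def apply_kernel(gray, kernel, r, c):
    """Weighted sum of the 3 x 3 neighbourhood centred on (r, c)."""
    return sum(kernel[i][j] * gray[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def sobel_magnitude(gray):
    """Return the gradient magnitude of Eq. (12); border pixels stay 0."""
    rows, cols = len(gray), len(gray[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_kernel(gray, GX, r, c)
            gy = apply_kernel(gray, GY, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 10, 10] for _ in range(4)]
magnitude = sobel_magnitude(step)
```

For this step edge only the horizontal kernel responds, so the interior pixels next to the step get a magnitude of 40 while the vertical component is 0.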

Compared with other operators, the Sobel operator has a larger filter size and stronger noise resistance. For painting images, it is more sensitive to grayscale changes between a target pixel and its neighborhood within a certain range, so it captures detail and reflects the shape and characteristics of brushstrokes better. However, after brush feature extraction by the Sobel operator, two problems remain: broken edges and cluttered edge lines. First, an edge line detected in an image might not be a single brushstroke, and in some areas of the image edge relatively complex edge lines are also highlighted. Second, the edge line around a brushstroke may not be completely sharp and thus breaks during detection, so some brushstrokes are not fully detected. In this study, morphological operations are employed to solve these two problems. The opening operation smooths the outline of the painting's edges and eliminates the fine brushstrokes in the edge area. Next, the closing operation reconnects the broken brush edge lines and fills in the fractures. Thus, opening followed by closing is adopted for brushstroke processing at the edge³⁴.

Here, a representative brush feature is selected for classification input. The steps of the image segmentation algorithm are denoted in Fig. 1.

Figure 1 illustrates the image segmentation method, which comprises four distinct steps. In step 1, the oil painting image undergoes morphological manipulation and is partitioned into 2,000 smaller components. In step 2, the greedy strategy is used to calculate the gradient similarity value between adjacent regions of each pair of sub-painting images, facilitating the merging of the two sub-painting images with the closest proximity. Step 3 involves sorting each region based on its density, while Step 4 entails selecting the brush feature that ranks among the top 6 in density. Ultimately, the brush feature with the top 6 densities is selected as the representative brush strokes of this painting. The rendering process of representative brushstrokes is plotted in Fig. 2.

Oil painting classification based on multiple features

A brush-feature-based CNN model is designed. After edge detection, morphological operation, and segmentation extraction, the resulting image (size 64 × 64) is used as input to the CNN for learning and training. The CNN model consists of 5 layers, as demonstrated in Fig. 3.

In Fig. 3, the C1 layer uses six 5 × 5 kernels to filter the input data and generate six 60 × 60 feature maps. The S1 layer then applies maximum pooling with a sub-sampling rate of 2 to each map, reducing the feature size and the number of model parameters. The C2 layer, the second convolutional layer, uses twelve 5 × 5 kernels to filter the data, resulting in twelve 26 × 26 feature maps. The S2 layer further reduces the feature size. The fully connected layer then takes a 2,028-dimensional vector and ultimately outputs 1,014-dimensional features.
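The layer sizes quoted above can be checked with a few lines of arithmetic. The sketch assumes unpadded convolutions with stride 1 and a pooling rate of 2 in S2 as well as S1; neither is stated explicitly in the text, but both are consistent with the 60 × 60, 26 × 26, and 2,028 figures. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded convolution (assumed stride 1)."""
    return (size - kernel) // stride + 1


def pool_out(size, window):
    """Output width of non-overlapping pooling with the given window."""
    return size // window


c1 = conv_out(64, 5)       # C1: six maps of 60 x 60
s1 = pool_out(c1, 2)       # S1: 30 x 30
c2 = conv_out(s1, 5)       # C2: twelve maps of 26 x 26
s2 = pool_out(c2, 2)       # S2: 13 x 13 (pooling rate 2 is an assumption)
flattened = 12 * s2 * s2   # input width of the fully connected layer
```

With these assumptions the flattened S2 output has 12 × 13 × 13 = 2,028 values, matching the dimension given for the fully connected layer.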

SVM is a kind of generalized linear classifier for the binary classification of data in supervised learning scenarios. Its decision boundary represents the maximum margin hyperplane for solving learning samples, transforming the problem into convex quadratic programming. Compared with logistic regression and neural networks, SVM offers a more transparent and potent method for learning complex nonlinear equations. The SVM model is usually used in the field of image processing. The basic principle of the model is presented in Fig. 4.

SVM maps vectors to a higher dimensional space, establishing a hyperplane with a maximum margin. Two parallel hyperplanes are constructed on either side of the hyperplane separating data. The maximization of distance between these parallel hyperplanes is prioritized for optimal separation. The hypothesis posited is that a larger distance or gap between parallel hyperplanes corresponds to a reduced total error of the classifier. Consequently, the second hypothesis asserts that SVM exhibits superior performance in the context of image classification.

The basic calculation of SVM is signified in Eq. (13):

$$f(x)=\sum_{i=1}^{n} {w}_{i}k\left(x,{x}_{i}\right)+b$$

(13)

w and b refer to the two parameters of the hyperplane; w stands for the normal vector of the hyperplane and b for its offset; i is the index; n represents n-dimensional feature space; $k\left(x,{x}_{i}\right)$ denotes the kernel function; x is the input argument.
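Equation (13) can be evaluated directly once a kernel is chosen. The sketch below uses a radial basis function kernel purely for illustration; the source does not say which kernel the study used, and the `gamma` value, the reference points, and the weights are made-up numbers.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision(x, points, weights, b, kernel=rbf_kernel):
    """Eq. (13): weighted sum of kernel values plus the offset b."""
    return sum(w * kernel(x, xi) for w, xi in zip(weights, points)) + b


# Two hypothetical reference points, one per class.
points = [(0.0, 0.0), (2.0, 2.0)]
weights = [-1.0, 1.0]
b = 0.0

score_near_first = decision((0.0, 0.0), points, weights, b)
score_near_second = decision((2.0, 2.0), points, weights, b)
score_midpoint = decision((1.0, 1.0), points, weights, b)
```

The sign of f(x) gives the predicted class: inputs near the first point score negative, inputs near the second score positive, and the midpoint lies on the decision boundary.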

SVM seeks the optimal hyperplane in a binary classification problem, and the hyperplane can be expressed as Eq. (14):

$$f(x)={w}^{T}x+b$$

(14)

The objective function of SVM is as follows:

$$\underset{w,\varepsilon ,b}{\min } \left\{\frac{1}{2}\parallel w{\parallel }^{2}+c\sum_{i=1}^{n} {\varepsilon }_{i}\right\}$$

(15)

c denotes the regularization parameter; $\varepsilon $ means slack variable.
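To make Eq. (15) concrete, the sketch below evaluates the objective for a fixed linear hyperplane. It assumes the standard soft-margin constraint, under which each slack variable equals max(0, 1 − yᵢ(wᵀxᵢ + b)); that constraint is not written out in the text, and the data points are invented for illustration.

```python
def slack(w, b, x, y):
    """Slack of one sample under the assumed soft-margin constraint."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)


def objective(w, b, data, c):
    """Eq. (15): half the squared norm of w plus c times the total slack."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    return regularizer + c * sum(slack(w, b, x, y) for x, y in data)


# Hypothetical samples: (feature vector, label in {-1, +1}).
data = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), 1)]
w, b = (1.0, 0.0), 0.0

value_small_c = objective(w, b, data, c=1.0)
value_large_c = objective(w, b, data, c=10.0)
```

Only the third sample falls inside the margin and contributes slack, so raising c from 1 to 10 increases the penalty for that one violation while the margin term stays the same.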

Lagrange multipliers are introduced to reconstruct the objective function through quadratic programming, as exhibited in Eq. (16):

$$\begin{array}{c}\max \sum_{i=1}^{m} {\lambda }_{i}-\frac{1}{2}\sum_{i,j=1}^{m} {\lambda }_{i}{\lambda }_{j}{y}_{i}{y}_{j}k\left({x}_{i},{x}_{j}\right)\\ 0\le {\lambda }_{i}\le c,\quad i=1,2,3,\dots ,m,\quad \sum_{i=1}^{m} {\lambda }_{i}{y}_{i}=0\end{array}$$

(16)

$\lambda $ stands for the Lagrange multiplier; $m$ expresses the number of training data; $y$ represents the label of the painting image; $x$ refers to its brush feature representation.

This study divides the oil painting dataset into different genres, including Cubism, Renaissance, Baroque, Rococo, and Impressionism. Color is one of the important features of oil painting. Hence, this study also uses the color features of oil paintings for classification, which proceeds in five steps, as represented in Fig. 5.

The first step converts the oil painting into the Hue Saturation Value (HSV) model, which captures its hue, saturation, and brightness³⁵. In the second step, the K-means clustering method is applied in HSV space to quantize the colors. In the third step, K is set to 20: 20 colors are randomly selected as the initial centers of 20 clusters, the mean value of each cluster is computed, and the process iterates until the colors are stable, at which point the algorithm terminates. This yields 20 clustering centers. In the fourth step, the clusters are sorted from largest to smallest by the number of pixels assigned to each center, and the top 6 are taken as the main colors of the input features. In the fifth step, the color features and the brush features extracted by the CNN are input into the SVM together.
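Steps one to four can be sketched with the standard library alone. This is an illustrative sketch rather than the study's code: the function names are hypothetical, the distance is plain Euclidean distance in HSV (which ignores the circular nature of hue), and the toy example uses K = 2 instead of 20 so that the result is easy to follow.

```python
import colorsys
import random


def kmeans(points, k, iterations=50, seed=0):
    """Basic K-means; returns the cluster centers and their pixel counts."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)   # random initial centers (step three)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster; keep empty ones.
        updated = [tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl
                   else centers[j] for j, cl in enumerate(clusters)]
        if updated == centers:        # colors are stable, so stop
            break
        centers = updated
    return centers, [len(cl) for cl in clusters]


def dominant_colors(rgb_pixels, k=20, top=6, seed=0):
    """Return the `top` HSV cluster centers ranked by pixel count."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
           for r, g, b in rgb_pixels]
    centers, counts = kmeans(hsv, k, seed=seed)
    ranked = sorted(zip(counts, centers), reverse=True)
    return [center for _, center in ranked[:top]]


# Toy image: 30 pure red pixels and 10 pure blue pixels.
pixels = [(255, 0, 0)] * 30 + [(0, 0, 255)] * 10
main_color = dominant_colors(pixels, k=2, top=1)[0]
```

Here the largest cluster is the red one, so `main_color` is the HSV triple for pure red. With `k=20` and `top=6`, the same function would return the six main colors described in the text.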

The separate color feature and the relevant feature combined with the brush features are employed to conduct the classification and comparison experiment of oil painting. The experimental process is portrayed in Fig. 6.

Construction of personalized painting teaching system by MP

With the swift growth of MPs such as iPads and mobile phones, education systems are increasingly developed on MPs. The personalized teaching system of intelligent oil painting constructed here also faces the challenge of MP. The construction of this teaching system combines AIT, collaborative filtering (CF) system, big data analysis (BDA) technology, and the oil painting classification technology mentioned above. The core content includes a painting creation module, a personal center, a community module, and a painting auxiliary module³⁶.

The community module covers the selection of material learning content. This part applies the oil painting classification technology designed in this study to classify oil paintings by school, thus improving the efficiency of retrieving and learning from different types of oil painting. It also encompasses live and video teaching. The painting creation module offers drawing tools and painting capabilities so that various templates can be used for painting. The painting auxiliary module involves intelligent recommendation, auxiliary tools, intelligent recognition, evaluation feedback, real-time auxiliary function, flow-process diagram generation, etc., which helps better complete personalized painting creation³⁷. The main contents of the personal center include a painting record, a collection, and so on. Among them, the CF system provides an intelligent push for oil paintings of related genres according to different search preferences.

Experimental dataset

The dataset used in this study was obtained from the online collection platform (https://gallerix.asia/a1/). The dataset comprises 1000 oil paintings from five different art styles (Baroque, Cubism, Impressionism, Renaissance, Rococo), with each style containing 200 artworks. During dataset construction, representative oil paintings were carefully selected to ensure coverage of the typical characteristics and stylistic differences of each style.

In the experiment, the dataset was divided into training and testing sets in a 75% to 25% ratio. Specifically, the training set for each art style consisted of 150 oil paintings, used for model training and parameter optimization. The testing set included 50 oil paintings for each style, used to evaluate the model’s performance on unseen data. This partitioning strategy aims to ensure that the model receives sufficient training and testing across various art styles, allowing for a more comprehensive assessment of its performance in the oil painting style classification task.
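A per-style 75/25 split of this kind can be sketched as follows. The file names are hypothetical placeholders, and the shuffling seed is an assumption; the source does not say how paintings were assigned to each set.

```python
import random

STYLES = ["Baroque", "Cubism", "Impressionism", "Renaissance", "Rococo"]


def stratified_split(items_by_style, train_fraction=0.75, seed=0):
    """Split each style separately so every style keeps the same ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for style, items in items_by_style.items():
        shuffled = items[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train += [(item, style) for item in shuffled[:cut]]
        test += [(item, style) for item in shuffled[cut:]]
    return train, test


# Hypothetical file names: 200 paintings per style.
paintings = {style: [f"{style.lower()}_{i:03d}.jpg" for i in range(200)]
             for style in STYLES}
train_set, test_set = stratified_split(paintings)
```

Splitting within each style, rather than over the pooled 1000 images, is what guarantees exactly 150 training and 50 testing paintings per style.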

Experimental platform

The experimental environment is outlined in Table 2.

Table 2 The experimental environment.

In this study, a computer with Intel Core i7-7700 CPU and 8 GB memory is selected as the hardware environment to ensure the stability and efficiency of the experiment. At the same time, a 64-bit operating system is employed to make full use of computing resources. In the software, Windows 10 Home Edition is chosen as the operating system and Pycharm 2019.2 IDE (Community Edition) is used as the integrated development environment. Based on their stability, ease of use, and wide applicability, these choices can effectively support related research work.

Hyperparameter setting

This study’s hyperparameter settings are shown in Table 3:

Table 3 Hyperparameter setting.

In model training, hyperparameter settings have a significant impact on the final performance. The learning rate is adjusted among 0.001, 0.005, and 0.01 to balance the model’s convergence speed and stability. Batch sizes are set at 32, 64, and 128 to explore the impact of different data batches on training efficiency and model accuracy. The Adam and SGD optimizers are employed, each with its unique convergence properties and suitable scenarios. Activation functions ReLU and Sigmoid are chosen to test their performance in handling nonlinear relationships. Initial weights are set using random initialization and Xavier initialization to ensure a reasonable weight distribution at the start of training. Training epochs are set at 50, 100, and 150 to fully evaluate the model’s performance over diverse training periods. Regularization parameter L2 is used with λ = 0.01 to prevent overfitting and improve generalization ability.
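The candidate values listed above can be enumerated as a grid. The sketch below only illustrates the size of that search space; the source does not state whether every combination was actually trained, and the dictionary keys are hypothetical names.

```python
from itertools import product

GRID = {
    "learning_rate": [0.001, 0.005, 0.01],
    "batch_size": [32, 64, 128],
    "optimizer": ["adam", "sgd"],
    "activation": ["relu", "sigmoid"],
    "initialization": ["random", "xavier"],
    "epochs": [50, 100, 150],
    "l2_lambda": [0.01],
}


def configurations(grid):
    """Yield one dict per combination of the candidate values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(configurations(GRID))
```

An exhaustive search over these values would cover 3 × 3 × 2 × 2 × 2 × 3 × 1 = 216 configurations.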
