Profile
Are you using the right model? Understand data science before you model

For engineers, learning to write programs and build models is not difficult. The real test is evaluating whether a model is applicable, or choosing an appropriate model for the data at hand. A common mistake is to feed data straight into a model for training without regard to its quality or source; the results will naturally be limited. Scientists, for their part, want more than a highly accurate model: more often they hope to use algorithms to uncover the correlations and regularities in the data, so that people can understand the phenomena around them more deeply.

We tend to hold a few myths about models. First, is a more complex model always better? Second, is a more accurate model always better? Applications of data science often span several fields, such as engineering, physics, chemistry, biology, ecology, philosophy, or economics, and can no longer be handled by a single discipline as in the past. It is therefore important to understand the basic knowledge and concepts of data science, which help modelers choose appropriate tools for the practical situation and avoid the trap of conclusions that only seem plausible. The following is from "Modeling Basics of Data Science: Don't rush to code!"

Do you know the pitfalls of models?

Even if a mathematical model can fully explain the data being analyzed, "when it has many parameters, or uses complex functions, it is difficult to judge whether the model's behavior merely happens to agree with the data or whether it really captures the essence of the phenomenon. With a simpler model, this situation improves." It is also important that every element of the model can be clearly stated in words, since others must be convinced of why that element is included: "An argument derived from a mathematical model is only as strong as (or weaker than) the weakest point in the model's logic." Therefore, to draw strong conclusions, there must be a logically convincing reason for including each element of the model (why this variable, mathematical structure, or parameter is used).

Models must not only be simple, they must also explain the data

However easy it is to understand, a model is meaningless if it cannot explain the data. Understanding-oriented modeling therefore always has to balance "how well the data can be explained" against "the complexity of the model". This is why, as mentioned earlier, each step of the modeling process must be checked repeatedly. And when the agreement with the data is comparable, the model that is easier to understand should generally be chosen.

Depth of understanding and modeling methods

The mechanisms behind a phenomenon are hierarchical, and the level you want to understand affects which model is appropriate. Take time series data as an example. If you only need a descriptive "understanding", such as which factors (trends, cyclical changes, and so on) are present, then fit a time series model and examine the estimated parameters or the inferred changes in variables. But if you want to know the mechanism that produces this dynamic behavior, you must choose a modeling method suited to the problem, such as dynamical systems, reinforcement learning, or multi-body systems.
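The balance between "how well the data can be explained" and "the complexity of the model" can be made concrete with a small sketch. The text does not name a specific criterion; the example below assumes the Akaike information criterion (AIC) for least-squares fits as one common way to penalize extra parameters, and it uses synthetic data and hypothetical helper names (`fit_polynomial`, `aic`, `compare_models`) purely for illustration. NumPy is assumed to be installed.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients and residual sum of squares."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(np.sum(residuals ** 2))


def aic(rss, n_samples, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_samples * np.log(rss / n_samples) + 2 * n_params


def compare_models(x, y, degrees):
    """Fit one polynomial per degree and report its fit (RSS) and penalized score (AIC)."""
    results = {}
    for degree in degrees:
        _, rss = fit_polynomial(x, y, degree)
        # A polynomial of degree d has d + 1 free coefficients.
        results[degree] = {"rss": rss, "aic": aic(rss, len(x), degree + 1)}
    return results


# Synthetic data (an assumption for this sketch): a straight line plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = 1.0 + 2.0 * x + rng.normal(scale=0.2, size=x.size)

results = compare_models(x, y, degrees=[1, 3, 9])
for degree, scores in results.items():
    print(f"degree={degree}  RSS={scores['rss']:.4f}  AIC={scores['aic']:.2f}")
```

Because these polynomial models are nested, a higher degree never fits the training data worse, so the residual alone always favors the most complex model. A penalized score such as AIC is one way to ask whether the extra parameters buy enough improvement to justify themselves; which degree it prefers depends on the data, and other criteria (cross-validation, BIC) may be more suitable for a given problem.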
Forum Role: Participant
Topics Started: 0
Replies Created: 0