New research from Meta AI introduces Theseus, a library for an optimization technique called differentiable nonlinear least squares (NLS). With Theseus, a PyTorch-based tool, researchers can quickly incorporate domain knowledge into AI architectures. It encodes that knowledge as an optimization problem and adds it to the design as a modular “optimization layer.” Separate from the training data, this domain expertise can increase model accuracy, and the approach is useful for building models of data governed by nonlinear functions. For example, with Theseus, researchers training a robotic arm can include a kinematics model as a layer to ensure the robot’s motions are smooth.
Theseus is the first application-agnostic nonlinear optimization library of this kind. Its forward pass is up to four times faster than Google’s C++ Ceres Solver. To reduce computation time and memory use, Theseus provides batching, GPU acceleration, sparse solvers, and implicit differentiation.
Theseus combines the best elements of two well-known strategies for injecting knowledge into AI systems. Before deep learning, robotics researchers used simpler optimization techniques, for example making robotic systems execute commands while minimizing joint motion and energy consumption. This strategy was effective but rigid, because the optimization was application-specific. Deep learning methods are more scalable, but they need large amounts of data and may produce solutions that work in training yet become unstable outside the training environment. Because Theseus is not application-specific, it can speed up AI development by enabling precise models across different tasks and circumstances.
Researchers train a deep learning model for a particular task using an appropriately chosen loss function. For backpropagation to update the model weights, every layer must be differentiable so that error information can flow through it. Because conventional optimization algorithms are not end-to-end differentiable, researchers face a trade-off. They can abandon optimization in favor of an end-to-end deep learning model tailored to the particular problem, losing the efficiency and generality of optimization. Alternatively, they can train the model offline and attach the optimizer only at inference time. The second approach combines deep learning with prior knowledge, but its predictions may be inaccurate because the deep learning model is trained without the task-specific error function.
To integrate these strategies, Theseus turns the optimization itself into a layer that can be inserted into any neural network architecture. As part of the end-to-end deep learning architecture, this lets researchers fine-tune for the final task loss while exploiting domain-specific knowledge.
NLS measures how far a nonlinear function deviates from the observed data: the smaller the sum of squared residuals, the better the fit. In robotics and vision, NLS is used for mapping, estimation, planning, and control.
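To make this concrete, here is a minimal NumPy sketch of solving an NLS problem with Gauss-Newton, the kind of solver Theseus wraps. The model, data, and function names are all hypothetical, chosen for illustration; this is not Theseus code.

```python
import numpy as np

# Fit the hypothetical model y = a * exp(b * t) to data by minimizing
# the sum of squared residuals r_i = a * exp(b * t_i) - y_i.

def residuals(params, t, y):
    a, b = params
    return a * np.exp(b * t) - y

def jacobian(params, t):
    a, b = params
    col_a = np.exp(b * t)            # dr/da
    col_b = a * t * np.exp(b * t)    # dr/db
    return np.stack([col_a, col_b], axis=1)

def gauss_newton(params, t, y, iters=20):
    for _ in range(iters):
        r = residuals(params, t, y)
        J = jacobian(params, t)
        # Normal equations: (J^T J) delta = -J^T r
        delta = np.linalg.solve(J.T @ J, -J.T @ r)
        params = params + delta
    return params

t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(0.5 * t)            # ground truth: a = 2.0, b = 0.5
est = gauss_newton(np.array([1.0, 0.0]), t, y)
```

With noise-free data, the estimate converges to the true parameters, and the sum of squared residuals approaches zero, signaling a good fit.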
Theseus makes NLS differentiable by embedding nonlinear optimization inside neural networks. Unlike standard neural layers, which transform input tensors linearly and apply a nonlinear activation function element-wise, an optimization layer’s input tensors define a sum-of-weighted-squares objective function, and its output tensors are that objective’s minimizers. Differentiating through the optimizer preserves end-to-end gradient computation.
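The idea of a layer whose inputs define an objective and whose output is its minimizer can be sketched in a few lines. This is a toy illustration under assumed names, not the Theseus API: the layer receives weights and targets, builds a sum-of-weighted-squares objective, and returns its argmin via Gauss-Newton steps.

```python
import numpy as np

def optimization_layer(weights, targets, iters=5):
    """Return argmin_x sum_i w_i * (x - targets_i)^2, found iteratively.

    The inputs (weights, targets) define the objective; the output is
    its minimizer -- the forward pass of an 'optimization layer'.
    """
    x = 0.0
    for _ in range(iters):
        r = np.sqrt(weights) * (x - targets)   # weighted residuals
        J = np.sqrt(weights)                   # dr/dx for each residual
        x = x - (J @ r) / (J @ J)              # Gauss-Newton step
    return x

w = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 4.0])
x_star = optimization_layer(w, y)

# Sanity check: this quadratic objective has a closed-form minimizer,
# the weighted mean of the targets.
assert np.isclose(x_star, (w @ y) / w.sum())
```

In a real differentiable-NLS setting, the upstream network would produce the weights or targets, and gradients of a downstream task loss would flow back through the solve.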
As a result, models can encode domain knowledge and still learn from the task loss, because both the optimizer and known priors sit inside the deep learning training loop. For example, researchers may put a known robot kinematics model in the optimizer to guarantee fluid robot motions, while the deep learning model learns to extract the higher-level goal from perception or language instructions during training. With the kinematics model fixed, the goal-prediction model can be trained end to end. Ultimately, the library allows domain knowledge to be encoded in end-to-end AI models, and combining known priors with neural components improves data efficiency and generalization.
Theseus supports sparse solvers, automatic vectorization, batching, GPU acceleration, and implicit gradient computation. It goes beyond solvers like Ceres, which support sparsity only, by adding implicit differentiation, automatic differentiation, and GPU acceleration. On a commodity GPU, Theseus runs faster and consumes less memory; on a set of challenging problems, its forward pass is up to four times faster than Ceres’. Implicit differentiation also produces better gradients than unrolling the solver, and, unlike unrolling, it keeps a constant memory and computation footprint as the number of optimization iterations grows.
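Implicit differentiation is worth a small worked example. Instead of backpropagating through every solver iteration (unrolling), one differentiates the stationarity condition at the solution, so the gradient cost is independent of how many iterations the solver ran. The inner problem below is a hypothetical scalar one, chosen so the answer can be checked by hand; it is not Theseus code.

```python
# Inner problem: x*(theta) = argmin_x (x - theta)^2 + x^2, so x* = theta / 2.
# At the optimum, the stationarity condition F(x, theta) = df/dx = 0 holds:
#   F(x, theta) = 2 * (x - theta) + 2 * x
# The implicit function theorem gives the gradient of the solution map:
#   dx*/dtheta = -(dF/dx)^(-1) * (dF/dtheta)
# using only quantities at the solution -- no need to store or backprop
# through the solver's iterations.

def solve_inner(theta, iters=50, lr=0.1):
    """Gradient descent on the inner objective (stand-in for any solver)."""
    x = 0.0
    for _ in range(iters):
        x -= lr * (2 * (x - theta) + 2 * x)
    return x

theta = 3.0
x_star = solve_inner(theta)          # approaches theta / 2 = 1.5

dF_dx = 4.0                          # d/dx  of 2*(x - theta) + 2*x
dF_dtheta = -2.0                     # d/dtheta of the same expression
grad_implicit = -dF_dtheta / dF_dx   # = 0.5, matching d(theta/2)/dtheta
```

Note that `grad_implicit` is exact regardless of how many inner iterations ran, which is why the memory and compute footprint stays constant as the solve gets longer.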
The team believes their methodology will support further investigation into the role and evolution of structure in complex robot systems, end-to-end learning on such systems, and continual learning through interactions with real-world objects.
This article is a summary written by Marktechpost staff based on the research paper 'Theseus: A Library for Differentiable Nonlinear Optimization'. All credit for this research goes to the researchers on this project. Check out the paper, GitHub link, project page, and tutorials.
Tanushree Shenwai is a consulting intern at Marktechpost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields, and is passionate about exploring new advancements in technology and their real-life applications.