# PolymathPlus NLP Specification

## What This Program Type Solves

NLP programs in PolymathPlus solve nonlinear optimization problems.

This specification covers:

- Non-LSQ optimization: table-free programs with `min:`
- LSQ optimization: table-based least-squares programs with `fit:`
- Optional explicit helper equations
- Optional equality and inequality constraints
- Initial guesses and lower/upper bounds for model variables

Primary documentation objective:

Help students and engineers describe a nonlinear optimization or constrained fitting problem in natural language, then have an AI generate valid PolymathPlus NLP source.

Common rules that always apply:

- `#` starts a comment. Comments are ignored by validation.
- Blank lines are allowed and ignored.
- Variable names must start with a letter and then use only letters, digits, or underscore.
- Avoid built-in function names or conditional keywords as variable names. Examples to avoid include `sin`, `sqrt`, `log`, `if`, `then`, and `else`.
- Keep one consistent program style. Do not mix NLP with DEQ, LEQ, NLE, or regression command formats.
- A program must contain exactly one main statement: either `min:` or `fit:`.

### Math expression syntax

NLP expressions use the shared PolymathPlus expression checker.

Supported expression features include:

- numeric literals, including decimal and scientific notation
- arithmetic operators `+`, `-`, `*`, `/`, and `^`
- parentheses
- built-in math functions such as `sin`, `cos`, `tan`, `sqrt`, `log`, `ln`, `exp`, and `arctan`
- conditional expressions using `if ... then ... else ...`
- comparison/logical operators inside conditional expressions

Current precheck behavior:

- Some implicit multiplication forms are accepted, including `2x`, `2 x`, and adjacent variable/function/parenthesis forms.
- For generated source, prefer explicit `*` because it is clearer and less likely to be misread by users or future tooling.

## User Rules For Writing A Valid NLP Program

### 1. Choose the NLP style

Use `min:` for nonlinear optimization without a data table:

```txt
min: objective_expression
```

Use `fit:` for least-squares parameter estimation against a data table:

```txt
fit: measured_variable = model_expression
```

Current precheck behavior:

- `fit:` without a table is invalid.
- If no recognized main statement is present, table presence affects the missing-statement message: with a table, precheck asks for `fit:`; without a table, it asks for `min:`.
- Current precheck can accept a table-plus-`min:` program when the variables resolve. Treat this as validator leniency, not recommended modeling syntax.

Recommended modeling practice:

- Use `min:` for objective functions written directly by the modeler.
- Use `fit:` when the objective is the sum of squared residuals between measured table values and a model expression.

Not recommended table with `min:`:

```polymathplus
[
x y
1 2
2 4
3 6
]
min: a*x
a| 1 0 10
```

Recommended LSQ form:

```polymathplus
[
x y
1 2
2 4
3 6
]
fit: y = a*x
a| 1 0 10
```

Invalid `fit:` without a table:

```polymathplus
fit: y = a*x
a| 1 0 10
x| 1 0 5
```

Repair:

```polymathplus
[
x y
1 2
2 4
3 6
]
fit: y = a*x
a| 1 0 10
```

### 2. Main statement rules

Non-LSQ syntax:

```txt
min: expression
```

LSQ syntax:

```txt
fit: y = expression
```

Rules:

- Exactly one main statement is allowed.
- The `min:` expression must be a valid math expression.
- The `fit:` right-hand expression must be a valid math expression.
- In `fit: y = expression`, the dependent variable `y` must not appear in the right-hand expression.
- Variables used in the main statement must be known from the table, explicit equations, or guess/bounds lines.
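
Two of the rules above (exactly one main statement, and no dependent variable on the right-hand side of `fit:`) can be illustrated with a minimal Python sketch. This is not the PolymathPlus precheck: `check_main_statements`, `names_in`, and the simplified tokenizer are hypothetical names introduced only for illustration, and the sketch assumes each statement sits on its own line.

```python
import re

# Simplified tokenizer (an assumption, not the real expression checker):
# matches identifiers or numeric literals, so "1e5" stays one number
# while "2x" splits into the number 2 and the name x.
TOKEN = re.compile(
    r"[A-Za-z][A-Za-z0-9_]*|\d+\.?\d*(?:[eE][+-]?\d+)?|\.\d+(?:[eE][+-]?\d+)?"
)


def names_in(expression: str) -> set[str]:
    """Return every identifier-like token in an expression (functions included)."""
    return {tok for tok in TOKEN.findall(expression) if tok[0].isalpha()}


def check_main_statements(source: str) -> list[str]:
    """Return a list of problems with the main statement of an NLP program."""
    mains = []
    for raw in source.splitlines():
        line = raw.split("#", 1)[0].strip()  # "#" starts a comment
        if line.startswith(("min:", "fit:")):
            mains.append(line)

    if len(mains) != 1:
        return [f"expected exactly one main statement, found {len(mains)}"]

    problems = []
    main = mains[0]
    if main.startswith("fit:"):
        body = main[len("fit:"):]
        if "=" not in body:
            return ["fit: must have the form 'fit: y = expression'"]
        lhs, rhs = (part.strip() for part in body.split("=", 1))
        if lhs in names_in(rhs):
            problems.append(
                f"dependent variable '{lhs}' appears on the right-hand side"
            )
    return problems


if __name__ == "__main__":
    print(check_main_statements("fit: y = a*x + y\na| 1 0 10"))
```

The sketch only covers these two rules; expression validity and variable resolution are left to the real checker.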

Invalid duplicate main statements:

```polymathplus
min: x^2 + y^2
min: (x - 1)^2 + (y - 1)^2
x| 0 -5 5
y| 0 -5 5
```

Repair:

```polymathplus
min: x^2 + y^2
x| 0 -5 5
y| 0 -5 5
```

Invalid `fit:` dependent variable on the right side:

```polymathplus
[
x y
1 2
2 4
3 8
4 16
]
fit: y = a*x + y
a| 1 0 10
```

Repair:

```polymathplus
[
x y
1 2
2 4
3 8
4 16
]
fit: y = a*x
a| 1 0 10
```

### 3. What LSQ minimizes

In NLP, `fit:` means least-squares parameter estimation.

For a program of the form:

```txt
fit: y = f(x, parameters)
```

with table rows `(x_i, y_i)`, the intended modeling pattern is that the solver uses the model parameters as decision variables and minimizes:

```txt
sum_i (y_i - f(x_i, parameters))^2
```

If constraints are included, the problem becomes constrained least squares:

- Objective: minimize squared residuals.
- Constraints: satisfy `= 0` constraints, `> 0` constraints, and variable bounds.

Recommended modeling practice:

- The `fit:` left-hand variable should be a measured table column.
- Include at least one fitted model parameter with a guess/bounds line.

Current precheck behavior:

- The `fit:` left-hand variable only has to be known; current precheck can accept it as a table variable, explicit variable, or guess/bounds variable.
- Current NLP precheck can accept a `fit:` expression with no fitted parameters. Treat this as validator leniency, not useful modeling syntax.

### 4. Data table rules for `fit:`

The LSQ style requires one table block enclosed by `[` and `]`.

Table format:

- First non-comment, non-blank line inside the table is the header.
- Header tokens are variable names separated by spaces.
- Header variable names must be unique.
- Each numeric row must have exactly the same number of columns as the header.
- Numeric cells must be valid numeric literals, including decimal and scientific notation.
- Comments and blank lines inside the table are ignored.
- Only one table block is allowed.
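
The table format rules above, together with the least-squares objective from Section 3, can be sketched in Python as follows. This is an illustrative sketch, not the PolymathPlus implementation: `parse_table` and `sum_squared_residuals` are hypothetical helpers, the sketch assumes `[` and `]` each sit on their own line, and it uses Python's `float`, which accepts a few tokens (such as `inf`) that the real precheck may reject.

```python
from typing import Callable


def parse_table(source: str) -> tuple[list[str], list[list[float]]]:
    """Parse the single [ ... ] block and enforce the table format rules."""
    # Strip comments and surrounding whitespace from every line.
    lines = [raw.split("#", 1)[0].strip() for raw in source.splitlines()]
    if lines.count("[") != 1 or lines.count("]") != 1:
        raise ValueError("exactly one table block is required")
    start, end = lines.index("["), lines.index("]")
    if end < start:
        raise ValueError("table closes before it opens")

    body = [ln for ln in lines[start + 1:end] if ln]  # blank lines are ignored
    if not body:
        raise ValueError("table has no header")

    header = body[0].split()
    if len(set(header)) != len(header):
        raise ValueError("header variable names must be unique")

    rows = []
    for ln in body[1:]:
        cells = ln.split()
        if len(cells) != len(header):
            raise ValueError(f"row {ln!r} does not match header width")
        rows.append([float(c) for c in cells])  # ValueError on non-numeric cells
    return header, rows


def sum_squared_residuals(
    header: list[str],
    rows: list[list[float]],
    dependent: str,
    model: Callable[[dict[str, float]], float],
) -> float:
    """Evaluate sum_i (y_i - f(x_i, parameters))^2 for fixed parameter values."""
    y_index = header.index(dependent)
    total = 0.0
    for row in rows:
        values = dict(zip(header, row))
        total += (row[y_index] - model(values)) ** 2
    return total


if __name__ == "__main__":
    src = "[\nx y\n1 2\n2 4\n3 6\n]\nfit: y = a*x\na| 1 0 10"
    header, rows = parse_table(src)
    # Objective value of fit: y = a*x at the initial guess a = 1.
    print(sum_squared_residuals(header, rows, "y", lambda v: 1 * v["x"]))
```

A solver would vary the fitted parameters (here `a`) within their bounds to reduce this value; the sketch only evaluates the objective at one parameter setting.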

Recommended modeling practice:

- Include at least one numeric data row.
- For `fit: y = expression`, make `y` a table header variable.

Current precheck behavior:

- NLP precheck may accept a header-only table. Treat this as validator leniency; it is not a meaningful fit dataset.

Valid LSQ table:

```polymathplus
[
x y
1.309 2.138
1.471 3.421
1.490 3.597
1.565 4.340
1.611 4.882
1.680 5.660
]
fit: y = b1*x^b2
b1| 0.7 0 2
b2| 4.0 0 10
```

Invalid row width:

```polymathplus
[
x y
1.309 2.138
1.471
1.490 3.597
]
fit: y = b1*x^b2
b1| 0.7 0 2
b2| 4.0 0 10
```

Repair:

```polymathplus
[
x y
1.309 2.138
1.471 3.421
1.490 3.597
]
fit: y = b1*x^b2
b1| 0.7 0 2
b2| 4.0 0 10
```

Invalid duplicate header variable:

```polymathplus
[
t y t
0 1 0
1 2 1
2 4 2
]
fit: y = a*exp(b*t)
a| 1 0 10
b| 0.5 0 3
```

Repair:

```polymathplus
[
t y
0 1
1 2
2 4
]
fit: y = a*exp(b*t)
a| 1 0 10
b| 0.5 0 3
```

### 5. Explicit variable rules

Explicit equations define helper variables:

```txt
helper = expression
```

Rules:

- Explicit variable names must be unique.
- Explicit expressions must be valid math expressions.
- Circular and self-referencing explicit definitions are invalid.
- A variable cannot be both an explicit variable and a model variable with a guess/bounds line.
- Variables used in explicit expressions must be known from the table, other explicit variables, or guess/bounds lines.
- Explicit variables may appear before or after the main statement, as long as the dependency graph is valid.

Valid helper equations:

```polymathplus
pi_val = 4*arctan(1)
volume = 100
min: pi_val*r^2*h
pi_val*r^2*h - volume = 0
r| 2 0.1 10
h| 5 0.1 20
```

Invalid circular helpers:

```polymathplus
min: x^2 + a
a = b + 1
b = a + 1
x| 1 -5 5
```

Repair:

```polymathplus
min: x^2 + a
a = b + 1
b = 2
x| 1 -5 5
```

Invalid explicit/model overlap:

```polymathplus
min: x^2 + scale
scale = 2
scale| 1 0 10
x| 1 -5 5
```

Repair:

```polymathplus
min: x^2 + scale
scale = 2
x| 1 -5 5
```

### 6. Initial guess and bound rules

NLP model variables are defined using pipe syntax:

```txt
var| guess lower_bound upper_bound
```

Rules:

- Each model variable can appear only once.
- `guess`, `lower_bound`, and `upper_bound` must be numeric literals.
- Bounds must satisfy `lower_bound <= upper_bound`.
- The initial guess must satisfy `lower_bound <= guess <= upper_bound`.
- A guess/bounds variable must be used in the main statement, an explicit expression, or a constraint.
- A variable used in the main statement must be defined as a table variable, explicit variable, or model variable.

Current precheck behavior:

- The accepted guess/bounds pattern is lenient about trailing text after the three numbers. For generated source, write only `var| guess lower_bound upper_bound`.

Valid:

```polymathplus
min: (x - 2)^2 + (y - 3)^2
x| 1.5 0 10
y| 2.5 0 10
```

Invalid missing guess for `x`:

```polymathplus
min: (x - 2)^2 + (y - 3)^2
y| 2.5 0 10
```

Repair:

```polymathplus
min: (x - 2)^2 + (y - 3)^2
x| 1.5 0 10
y| 2.5 0 10
```

Invalid duplicate guess:

```polymathplus
min: x^2 + y^2
x| 1 -5 5
x| 2 -5 5
y| 1 -5 5
```

Repair:

```polymathplus
min: x^2 + y^2
x| 1 -5 5
y| 1 -5 5
```

Invalid bound order:

```polymathplus
min: (T - 350)^2
T| 340 900 150
```

Repair:

```polymathplus
min: (T - 350)^2
T| 340 150 900
```

Invalid unused model variable:

```polymathplus
min: x^2
x| 1 -5 5
y| 1 -5 5
```

Repair:

```polymathplus
min: x^2
x| 1 -5 5
```

### 7. Constraint rules

Allowed constraint formats:

```txt
expression = 0
expression > 0
```

Rules:

- Equality constraints must be written as `expression = 0`.
- Inequality constraints must be written as `expression > 0`.
- `>=` and `<=` are not supported by NLP precheck.
- Inequality right-hand side must be exactly `0`.
- To express `a <= b`, write `b - a > 0`.
- To express `a >= b`, write `a - b > 0`.
- Constraint expressions may use known variables only.
- If constraining a model variable to equal zero, avoid a bare line like `x = 0`; precheck reads that as an explicit assignment before constraint parsing. Write `x - 0 = 0`.

Valid constrained optimization:

```polymathplus
min: (x - 1)^2 + (y - 1)^2
x + y - 1 = 0
1.5 - (x^2 + y^2) > 0
x| 0.5 -2 2
y| 0.5 -2 2
```

Invalid unsupported `>=`:

```polymathplus
min: (x - 1)^2 + (y - 1)^2
x + y >= 1
x| 0.5 -2 2
y| 0.5 -2 2
```

Repair:

```polymathplus
min: (x - 1)^2 + (y - 1)^2
x + y - 1 > 0
x| 0.5 -2 2
y| 0.5 -2 2
```

Invalid nonzero inequality right side:

```polymathplus
min: (x - 1)^2 + (y - 1)^2
1.5 - (x^2 + y^2) > 3
x| 0.5 -2 2
y| 0.5 -2 2
```

Repair:

```polymathplus
min: (x - 1)^2 + (y - 1)^2
-1.5 - (x^2 + y^2) > 0
x| 0.5 -2 2
y| 0.5 -2 2
```

### 8. Unknown and unrecognized lines

Every non-comment, non-blank line outside a table must be one of:

- a main statement
- an explicit equation
- a guess/bounds line
- a valid equality constraint
- a valid inequality constraint
- a table delimiter

Invalid standalone expression:

```polymathplus
min: (T - 350)^2 + 0.1*W^2
T + W
T| 340 300 450
W| 200 0 400
```

Repair:

```polymathplus
min: (T - 350)^2 + 0.1*W^2
T + W - 540 = 0
T| 340 300 450
W| 200 0 400
```

Invalid unknown variable:

```polymathplus
min: (x - 2)^2 + (y - 3)^2
2 - x*t > 0
x| 1 0 10
y| 1 0 10
```

Repair:

```polymathplus
min: (x - 2)^2 + (y - 3)^2
2 - x*y > 0
x| 1 0 10
y| 1 0 10
```

## Solver Options Metadata

Solver metadata can be added as comment directives:

```polymathplus
#@NlpSolver = gr
#@Trace = true
```

Common solver names used by solver-side code are:

- `gr`: GRG
- `al`: Augmented Lagrangian
- `ba`: BANLP

Current precheck behavior:

- These lines are comments/directives for precheck.
- NLP precheck does not validate solver option values.
- Solver components, not precheck, interpret the metadata.
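
As a rough illustration of how a solver-side component might pick up these directives, the Python sketch below collects `#@Key = value` lines into a dictionary. The directive grammar is inferred only from the two example lines above, so the regular expression, the `read_directives` name, and the choice to keep values as uninterpreted strings are all assumptions rather than documented PolymathPlus behavior.

```python
import re

# Assumed directive shape: "#@", a name, "=", then the rest of the line as the value.
DIRECTIVE = re.compile(r"^#@\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$")


def read_directives(source: str) -> dict[str, str]:
    """Collect comment directives; ordinary comments and statements are skipped."""
    options: dict[str, str] = {}
    for raw in source.splitlines():
        match = DIRECTIVE.match(raw.strip())
        if match:
            # Values stay as strings; interpreting them is left to the solver.
            options[match.group(1)] = match.group(2)
    return options


if __name__ == "__main__":
    program = "#@NlpSolver = gr\n#@Trace = true\n# plain comment\nmin: x^2\nx| 1 -5 5"
    print(read_directives(program))
```

Because precheck treats these lines as comments, a sketch like this would run separately from validation and should not change whether a program is accepted.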

## Practical Modeling Checklist

Before generating or solving an NLP program:

- Decide whether the problem is table-free optimization (`min:`) or table-based least squares (`fit:`).
- For `fit:`, include exactly one valid data table and keep the dependent variable off the right-hand side.
- Write the objective or model expression in valid math syntax.
- Add explicit helper equations only when they simplify the model.
- Add one guess/bounds line for each decision variable.
- Keep every guess inside its bounds.
- Convert constraints to `= 0` or `> 0`.
- Ensure every variable is defined and used.
- Remove duplicate definitions and circular helper equations.

## Examples: Valid Programs

### A. Unconstrained nonlinear optimization

```polymathplus
min: (1.5 - x + x*y)^2 + (2.25 - x + x*y^2)^2 + (2.625 - x + x*y^3)^2
x| 1 -4.5 4.5
y| 1 -4.5 4.5
```

### B. Constrained nonlinear optimization

```polymathplus
min: x1^2 + x2^2 + 2*x3^2 + x4^2 - 5*x1 - 5*x2 - 21*x3 + 7*x4
2*x1^2 + x2^2 + x3^2 + 2*x1 - x2 - x4 - 5 = 0
x1^2 + x2^2 + x3^2 - x1 + x2 - x4 - 8 = 0
x1^2 + 2*x2^2 + x3^2 + 2*x4^2 + x1 + x4 - 10 = 0
x1| 0 -50 50
x2| 1 -50 50
x3| 2 -50 50
x4| -1 -50 50
```

### C. Engineering design optimization

```polymathplus
pi_val = 4*arctan(1)
target_volume = 100
r_min = 0.1
r_max = 5
h_min = 0.1
h_max = 10
min: radius^2 + radius*height
pi_val*radius^2*height - target_volume = 0
radius - r_min > 0
r_max - radius > 0
height - h_min > 0
h_max - height > 0
radius| 2 0 30
height| 4 0 30
```

### D. Least-squares fit

```polymathplus
[
x y
1.309 2.138
1.471 3.421
1.490 3.597
1.565 4.340
1.611 4.882
1.680 5.660
]
fit: y = b1*x^b2
b1| 0.7 0 2
b2| 4.0 0 10
```

### E. Constrained least-squares fit

```polymathplus
[
x y
0.0 0.91
0.5 1.48
1.0 2.40
1.5 3.73
2.0 5.00
2.5 6.27
3.0 7.60
3.5 8.52
]
fit: y = level/(1 + exp(-rate*(x - center)))
level - 0 > 0
rate - 0 > 0
level| 8.0 0 20
rate| 0.8 0 5
center| 1.5 -1 5
```

## Verification Notes

The rules and examples in this repo-local hardened spec were checked against:

- `C:\dev\js\solver_precheck\solver_precheck.js`
- NLP valid and invalid cases from `C:\dev\js\solver_precheck\solver_precheckTests.js`

The examples were cleaned for documentation use, including simplified names and removal of test-only comments.