python-plus-plus/plans/do-while.md

# Do While
I have just noticed something about the grammar of this language. I use `do`-`while` without the `while` as the block syntax inside `while`/`for` loops and the branches of `if` conditions. But this also means that you can't put a `while` loop directly after any of them, because the parser will interpret the `while` condition as the tail of the `do` block belonging to the preceding loop or `if`. Consider the following:
```
if <condition1> do {
<block1>
}
while <condition2> do {
<block2>
}
```
In this example, if `condition1` is true, then `block1` will be executed at least once and repeated while `condition2` is true. `block2` will then be executed once, no matter whether or not `condition2` is true, or maybe more times if you have another `while` loop after it as well.
## Correction
OK, um, actually... I just checked, and this is not true. It instead just causes a syntax error, claiming that the `do` keyword in the `while` loop after the `if` should be a semicolon, because a do-while loop needs a semicolon after its `while` condition. If you then put a semicolon before the `do`, you get the behaviour described above, but you probably shouldn't do that. I guess the error also makes it a bit more obvious what is happening, because from it you might infer that the `while` condition is being interpreted as part of the `do` block. For `if`s there is a remedy, which is to put ` else {}` at the end, which ensures that the parser isolates the `do` block from the `while` condition. But this is not a good solution, and this is definitely a problem that I will need to solve.
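To make the failure mode concrete, here is a toy recursive-descent parser in Python. It is only a sketch and not the actual parser of this language: the `Parser` class, `tokenize`, `parse`, and the single-token conditions are all assumptions made for illustration. It follows the rules described above: `if` and `while` parse a condition and then exactly one statement, and a `do` statement greedily takes a trailing `while <condition>;`.

```python
def tokenize(src):
    # Hypothetical tokenizer: braces and semicolons become their own tokens.
    for ch in "{};":
        src = src.replace(ch, f" {ch} ")
    return src.split()


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, want):
        got = self.next()
        if got != want:
            raise SyntaxError(f"expected {want!r}, got {got!r}")

    def statement(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok in ("if", "while"):
            self.next()
            cond = self.next()       # condition simplified to a single token
            body = self.statement()  # exactly one statement as the body
            return (tok, cond, body)
        if tok == "do":
            self.next()
            body = self.statement()
            if self.peek() == "while":  # greedy: a following `while` belongs to this `do`
                self.next()
                cond = self.next()
                self.expect(";")
                return ("do-while", body, cond)
            return ("do", body)
        if tok == "{":
            self.next()
            stmts = []
            while self.peek() not in ("}", None):
                stmts.append(self.statement())
            self.expect("}")
            return ("block", stmts)
        return ("stmt", self.next())

    def program(self):
        out = []
        while self.peek() is not None:
            out.append(self.statement())
        return out


def parse(src):
    return Parser(tokenize(src)).program()
```

With this sketch, `parse("if c1 do { b1 } while c2 do { b2 }")` raises a `SyntaxError` complaining that it expected `';'` and got `'do'`, and adding the semicolon after `c2` produces an `if` whose body is a do-while, followed by a separate plain `do` block.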
I remember the original reason I decided to do this hack with `do` blocks: I needed some keyword to differentiate a block from a struct instantiation, because the syntax for structs is `MyStruct{arg1=val1, arg2=val2, ...}`.
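The clash between blocks and struct syntax can be sketched in a few lines of Python. This is a toy under stated assumptions, not the real implementation: `tokenize`, `peek`, and `parse_condition` are hypothetical names, expressions are reduced to a single identifier, and `{` is treated as a postfix struct instantiation on any expression.

```python
def tokenize(src):
    # Hypothetical tokenizer: punctuation becomes separate tokens.
    for ch in "{}=,":
        src = src.replace(ch, f" {ch} ")
    return src.split()


def peek(tokens, pos):
    return tokens[pos] if pos < len(tokens) else None


def parse_condition(tokens, pos):
    """Parse an identifier followed by any number of `{field=value, ...}` suffixes."""
    expr = ("name", tokens[pos])
    pos += 1
    while peek(tokens, pos) == "{":  # `{` after an expression starts a struct instantiation
        pos += 1
        fields = {}
        while peek(tokens, pos) != "}":
            if peek(tokens, pos + 1) != "=" or peek(tokens, pos + 2) is None:
                raise SyntaxError(
                    f"expected 'field=value' in struct instantiation, got {peek(tokens, pos)!r}"
                )
            fields[tokens[pos]] = tokens[pos + 2]
            pos += 3
            if peek(tokens, pos) == ",":
                pos += 1
        pos += 1  # consume the closing `}`
        expr = ("struct", expr, fields)
    return expr, pos
```

Given `if flag { x = 1 }`, the condition parser swallows the whole `{ x = 1 }` as a struct instantiation of `flag`, leaving nothing for the body. Given `if flag do { x = 1 }`, it stops at `do`, which is why the keyword works as a separator.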
## Paths forward
From what I see, there are a few ways that I can solve this.
1. Use a keyword to indicate the end of the condition.
For `if` blocks this can obviously be done with a `then` keyword, but the keyword is not so obvious for `while` and `for` loops. I could keep the `do` keyword, but have these loops explicitly check for the presence of the `do` keyword before continuing. Currently they parse the `while`/`for` keyword, then parse the condition as an expression, then parse just one statement. You can use a block there because an entire `do` block counts as one statement. The `do` block, like the loops, also parses only one statement for its body, and a sequence of statements wrapped in `{` and `}` counts as that one statement. The reason you can't just directly do
```
if <condition> {
<block>
}
```
is because the code that parses the condition will attempt to parse the `{` as the start of a struct instantiation.
2. Modify when struct instantiation is valid. Currently, you can use `{` and `}` to start a struct instantiation after any expression, and it is only checked at runtime whether the value of the expression is a type object that is a struct. I could make this a check in the grammar by requiring the expression to be a single identifier token, but I still don't think this would work, because what if the identifier token is a boolean variable? I want to be able to do:
```
if flag {
<block>
}
```
But the parser will attempt to treat `flag` as the identifier naming the type for a struct. So I don't think this will work. The reason the check on whether or not the expression is a struct type happens only at runtime is that when I originally wrote it, I think I was anticipating some kind of expression that returns new types, and I wanted you to be allowed to instantiate structs that you construct in this way. Looking back, I don't think that was a good idea. But in the runtime code, the only thing that I can really do is make the AST node for a struct instantiation require a single identifier instead of a whole expression, and this doesn't solve the issue of how I am going to differentiate between
```
if flag {
<block>
}
```
and
```
if MyStruct {
arg1=val1,
arg2=val2,
...
} {
<block>
}
```
where you are checking the truthiness of your instance of `MyStruct`.
3. I could just require parentheses around the condition. Honestly, though, I don't really want to, because I just like the aesthetic of not needing parentheses around conditions. So I probably won't do this.
4. Change how struct instantiation works syntactically. I could just use `(` and `)` instead of `{` and `}`, so `MyStruct(arg1=val1, arg2=val2, ...)`. But honestly, I don't like this for the same reason as the previous fix. However, just as I am writing this, I think I have come up with the solution: `.`. A single period would separate the type and the open curly brace, so `MyStruct.{arg1=val1, arg2=val2, ...}`. Honestly, I think that this will be what I go with.
It also allows me to solve my problem concerning the types of array literals. I'm currently doing some very weird stuff with empty variable types, and using the first element to indicate the type of the rest, which isn't always ideal if the first element is a subtype of the type you want for the array. It's just a whole ordeal, which I think this syntax can solve. I can do something like Jai (I think it does this? I haven't checked), where you prepend the array with the type and then a period, so `int.[0, 1, 2, 3]`. This will also allow you to indicate whether or not integer literals should be interpreted as floats, for example. I think that if you tried to create an array of floats using integer literals, it wouldn't work, and you would have to do something like putting a `.0` at the end of all the integers.
Back on the topic of the period, this syntax is basically already being used for enums, where you have to specify the type and then the member. I had always been planning on allowing the type to be omitted if the context was enough to deduce what enum type was being returned. For example, if you are in a function, the language can use the return type to deduce the type. I can now also extend this idea to structs and arrays, where you don't need to specify the type if it is determined by something like the return type of a function. I guess there would just be a few wrinkles to iron out depending on what kind of expression is allowed to the left of the period.
I will probably have to make type expressions just regular expressions (i.e. normal expressions, not RegEx), and then have the parser check whether the expression to the left of the period is a type expression. The same check will also have to be done for type declarations in function arguments and whatnot. I think out of all the paths forward, this will be what I will do, if I ever come back to this language. Ultimately, though, this is just syntax bikeshedding, and there are certainly more important things that I should be focusing on at the moment if I want to get this language into a functional state.
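Here is a rough Python sketch of how the period prefix removes the ambiguity. It is a toy, not this language's parser: every name in it (`tokenize`, `peek`, `take`, `expect`, `parse_expression`, `parse_if`) is hypothetical, the type to the left of the period is limited to a single identifier, and values and statements are single tokens. The key assumption is that `.{` and `.[` are lexed as single tokens, so a bare `{` after a condition can only start a block.

```python
import re

# `.{` and `.[` are matched before the single-character punctuation.
TOKEN = re.compile(r"\.\{|\.\[|[{}\[\],=]|\w+")


def tokenize(src):
    return TOKEN.findall(src)


def peek(tokens, pos):
    return tokens[pos] if pos < len(tokens) else None


def take(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    return tokens[pos], pos + 1


def expect(tokens, pos, want):
    if peek(tokens, pos) != want:
        raise SyntaxError(f"expected {want!r}, got {peek(tokens, pos)!r}")
    return pos + 1


def parse_expression(tokens, pos):
    name, pos = take(tokens, pos)
    expr = ("name", name)
    if peek(tokens, pos) == ".{":    # struct instantiation: Type.{field=value, ...}
        pos += 1
        fields = {}
        while peek(tokens, pos) != "}":
            field, pos = take(tokens, pos)
            pos = expect(tokens, pos, "=")
            value, pos = take(tokens, pos)
            fields[field] = value
            if peek(tokens, pos) == ",":
                pos += 1
        pos = expect(tokens, pos, "}")
        expr = ("struct", expr, fields)
    elif peek(tokens, pos) == ".[":  # typed array literal: Type.[a, b, ...]
        pos += 1
        items = []
        while peek(tokens, pos) != "]":
            item, pos = take(tokens, pos)
            items.append(item)
            if peek(tokens, pos) == ",":
                pos += 1
        pos = expect(tokens, pos, "]")
        expr = ("array", expr, items)
    return expr, pos


def parse_if(tokens, pos=0):
    pos = expect(tokens, pos, "if")
    cond, pos = parse_expression(tokens, pos)
    pos = expect(tokens, pos, "{")   # a bare `{` now always opens the block
    body = []
    while peek(tokens, pos) != "}":
        stmt, pos = take(tokens, pos)
        body.append(stmt)
    pos = expect(tokens, pos, "}")
    return ("if", cond, body)
```

Both `if flag { body }` and `if MyStruct.{arg1=val1, arg2=val2} { body }` parse without a `do` keyword here, and `int.[0, 1, 2, 3]` comes out as an array node tagged with `int`. Omitting the type when context determines it is not covered by this sketch.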
2024-08-11 13:07:05 +10:00
I haven't worked on this language in like half a year or so, and I only really came back to check it out because I started learning to use Emacs, and I just wanted to explore my old projects through Emacs. I will probably not work on it soon, but once everything that is taking up my time has been dealt with, I plan on working on the language further. As you can probably tell from the opening of this, I wrote this without really checking whether or not what I wrote was true. This problem with `do`-`while` just came to me as a random thought, and I decided to write this markdown file as some documentation for my future self in case I ever came back to this language. I think it was certainly productive, because over the course of writing this, I think I came across something even better for the language, inspired by Jai. I originally wanted this language to be at the same level as Python, as this language came out of my frustrations with Python, but on the whole I think I now want to eventually make this language closer to C/Rust/Jai, and I will need to do a lot of work to get there. The current state of the project (which, like I said, has been untouched for half a year) is nowhere near that, and is, like originally planned, closer to where Python is, especially in how it deals with dicts and lists. There is currently no memory management at all, and you can't even decide the size of ints. So yeah, this project has a long way to go.