It's 2020. SQLI and XSS are still some of the most widespread attack vectors in web applications. Powerful type systems have proven to be an effective tool for preventing these problems.

@raichoo Oof, careful... Key is input validation and sanitization, on an abstract level. Moreover, reflection is not the only source for XSS. In practice, not to concatenate unescaped input blindly is what matters mostly imho. You can very much do that in FP again. ;)

Here's how I teach handling RPC:
Input Validation
Output Validation

And yes, you need input validation on multiple layers, especially when you mix data in a single request.

@CyReVolt Type systems can be used to ensure that certain measures have been taken. Take things like unescaped input: that can be marked in the type system, causing compile-time errors when doing something like concatenating it with data that later gets evaluated by some other system, such as a database.
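A minimal sketch of that idea in TypeScript, under stated assumptions: `Unescaped`, `Query`, `Query.sql` and `runQuery` are hypothetical names invented for illustration, not the API of any real database library. Untrusted values can only travel as bound parameters, and the stand-in database call refuses plain strings.

```typescript
// All names here are hypothetical; this is not the API of any real database library.

// Untrusted input. The private field makes the class nominal in TypeScript,
// so a plain string is not accepted where an Unescaped is expected.
class Unescaped {
  constructor(private readonly raw: string) {}
  // Deliberately ugly accessor, easy to find in review.
  unsafeRaw(): string { return this.raw; }
}

// A query whose text is developer-written and whose untrusted parts
// travel separately as bound parameters.
class Query {
  private constructor(
    private readonly sqlText: string,
    private readonly bound: readonly string[],
  ) {}

  text(): string { return this.sqlText; }
  params(): readonly string[] { return this.bound; }

  // Tagged template: literal parts become SQL, interpolated values become $1, $2, ...
  static sql(parts: TemplateStringsArray, ...values: Unescaped[]): Query {
    let text = parts[0];
    values.forEach((_, i) => { text += "$" + (i + 1) + parts[i + 1]; });
    return new Query(text, values.map((v) => v.unsafeRaw()));
  }
}

// Stand-in for a database call: it only accepts Query, never string.
function runQuery(q: Query): string {
  return q.text() + " <- " + JSON.stringify(q.params());
}

const userInput = new Unescaped("x' OR '1'='1");
const q = Query.sql`SELECT * FROM users WHERE name = ${userInput}`;
const sent = runQuery(q);

// runQuery("SELECT * FROM users WHERE name = '" + userInput.unsafeRaw() + "'");
// ^ rejected by the compiler: string is not assignable to Query
```

The concatenation itself still type-checks in TypeScript; the error appears at the sink, which is where it matters.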

@CyReVolt You can use the same trick to mark sensitive data that should under no circumstances take a code path that causes this data to be presented to a certain endpoint (e.g. un-anonymized data).
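The same trick for sensitive data could look roughly like this. It is a sketch only: `Sensitive`, `anonymize`, `PublicReport` and `publish` are made-up names, and the anonymization rule is a placeholder.

```typescript
// Hypothetical wrapper; the names are illustrative only.
class Sensitive<T> {
  constructor(private readonly value: T) {}

  // The only way out is to state how the value is made safe to expose.
  anonymize<U>(f: (value: T) => U): U { return f(this.value); }

  // Accidental logging or serialisation does not leak the value.
  toString(): string { return "[redacted]"; }
  toJSON(): string { return "[redacted]"; }
}

interface PublicReport { ageBracket: string; city: string; }

// Stand-in for an endpoint that publishes data: it takes only public types.
function publish(report: PublicReport): string {
  return JSON.stringify(report);
}

const birthYear = new Sensitive(1987);
const bracket = birthYear.anonymize((y) => (y < 1990 ? "before 1990" : "1990 or later"));
const body = publish({ ageBracket: bracket, city: "Berlin" });

// publish({ ageBracket: birthYear, city: "Berlin" });
// ^ rejected by the compiler: Sensitive<number> is not assignable to string
```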

@CyReVolt Sadly, techniques like these don't seem to be that widespread in "enterprise code", given how cumbersome working with a lot of industry-level type systems is. "Just pass strings everywhere, it'll be fine!" :thisisfine:

@raichoo I know, I'm deeply in there. Yes, "can" is the word. In more theoretical computer science, we're looking for what's called a "decider" to determine input validity - yes, pattern matching is one way of looking at it, just like PDAs (push-down automata) in a more abstract way (which, yes, pattern matching can be an implementation of). For more advanced attacks, we need not just take specific fields into account, but also combinations thereof. :-) Validity isn't transitive.
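The point about combinations of fields can be illustrated with a small TypeScript sketch. `DayRange` and `Checked` are hypothetical names; the example only shows that two fields which each pass their own check can still form an invalid pair.

```typescript
// Hypothetical types for illustration only.
type Checked<T> =
  | { kind: "ok"; value: T }
  | { kind: "error"; message: string };

// A day range within one year. The private constructor and private fields mean
// the only way to obtain a DayRange is through parse().
class DayRange {
  private constructor(
    private readonly first: number,
    private readonly last: number,
  ) {}

  length(): number { return this.last - this.first + 1; }

  static parse(first: number, last: number): Checked<DayRange> {
    // Per-field check: each value must be a whole day number from 1 to 365.
    const isDay = (d: number) => Math.floor(d) === d && d >= 1 && d <= 365;
    if (!isDay(first) || !isDay(last)) {
      return { kind: "error", message: "day out of range" };
    }
    // Combination check: two valid fields can still form an invalid pair.
    if (first > last) {
      return { kind: "error", message: "range ends before it starts" };
    }
    return { kind: "ok", value: new DayRange(first, last) };
  }
}

const fine = DayRange.parse(10, 20);
const backwards = DayRange.parse(200, 100); // both fields are valid on their own
```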

@CyReVolt It would be great if this discussion would not go like "hey you don't seem to know anything, let me explain the world to you". Sorry but this is very patronizing.

@raichoo Please don't get me wrong here. There is no need for any fight like that. I do think that type systems are a good path to follow. I am just not sold on "FP will fix it", so I mainly found that tag misleading. Anyway, not the core topic. So to come back to the original claim thusly: "have proven" could use some backing; do you know some work/papers on that? The hard part for most people like me is to "sell" such things to business. Any support counts, and they like numbers and facts. :-)

@CyReVolt I'm not actually trying to sell FP here. Maybe that tag rubbed you the wrong way (I consider FP more of a style than a language feature). There is some research on the topic; if I'm not mistaken, Benjamin Pierce has written some stuff on it, but I'd have to look that up. Also, the langsec folks in general have written a whole fudgeload on this stuff. But yeah, "selling" this to the Java-Honchos is the tricky part.

@raichoo Right, I guess the "FP will solve it" thing is something I've come across often lately, so that may have created some bias in me. Sorry. :/

Thanks for the tip! From my experience in the industry, spreading knowledge is the best thing to do. Business won't invest in it because it doesn't create value, so we have to do it ourselves to stand up to our own values. Software development has become a diplomatic topic. =)

@CyReVolt Yes, FP is often presented as a panacea. It's harmful. Also seen a lot of "100% solutionists". "If it does not solve 100% it's not worth it." That's especially true in the industry. People who often argue in favor of "evolution vs revolution" seem to be surprisingly opposed to gradual improvements.

@raichoo Yea it sounds like it doesn't fit together. Measuring impact of change is hard, so maybe that's why many people try to reduce it to something binary and because there is no 100% solution we're sort of stuck. That's probably why we as developers are asked by technical management to provide data backing our claims of improvements. Enterprise is mostly following a top-down approach, so feedback is hardly heard. I'm not sure agile solves it, but definitely think there are valid points.

@CyReVolt It's weird and I'm still trying to wrap my head around what's going on there. For some reason the industry has not seemed opposed to switching JS frameworks every new moon phase over the past decade. Impact measurement is hard, but at my old job we actually implemented complex web applications in Haskell that severely outperformed known approaches in every respect, and management still opposed it. Even with proof-by-construction.

@hopeless @raichoo If used correctly, yes. I've seen too many sites applying the most generic CSP policy allowing the content to work and not doing much to prevent XSS.

@wasamasa @raichoo yep if it's all munged together, you actually have to do the work to segregate the scripts and css into files to set the default to nothing allowed but those to get the gains. If you still bring in a dozen remote third-party scripts, that you don't control the content of, it's questionable how secure the result is. But if maxxed out, it brings a major upgrade in architecture and maintainability along with the security.
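For reference, a "nothing allowed but those" policy might be assembled like this. This is a sketch that assumes all scripts, styles and images are served from the site's own origin; a real site would need to adjust the source lists.

```typescript
// Strict starting point: nothing is allowed unless it is listed explicitly.
// The source lists are an assumption (everything served from the same origin).
const directives: [string, string[]][] = [
  ["default-src", ["'none'"]],
  ["script-src", ["'self'"]],
  ["style-src", ["'self'"]],
  ["img-src", ["'self'"]],
];

// Value for the Content-Security-Policy response header.
const cspHeaderValue = directives
  .map(([directive, sources]) => directive + " " + sources.join(" "))
  .join("; ");
```

Inline scripts and styles are blocked by such a policy, which is why the markup has to be split into separate files first.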

@raichoo but how? I mean in the end it is a question of how you treat user input.

@esopriester Mark user input in the type system as "unescaped". Have an escape function taking unescaped input transforming it to something of type "escaped". Only accept things of type "escaped" when splicing that data in. Used this approach a ton of times, it always works. Plenty of Haskell libraries are already doing that e.g. blaze-html.
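Spelled out in TypeScript, the unescaped/escaped split might look like the sketch below. The names `Unescaped`, `Escaped`, `unsafeTrusted` and `paragraph` are hypothetical and are not the API of blaze-html or any other library.

```typescript
// Hypothetical names; this is not the API of blaze-html or any other library.

// Untrusted text. The private field makes the class nominal.
class Unescaped {
  constructor(private readonly raw: string) {}
  unsafeRaw(): string { return this.raw; }
}

class Escaped {
  private constructor(private readonly markup: string) {}

  // The normal path from untrusted text to markup. "&" is replaced first
  // so that the entities produced by later steps are not escaped again.
  static escape(input: Unescaped): Escaped {
    const text = input.unsafeRaw()
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;")
      .replace(/'/g, "&#39;");
    return new Escaped(text);
  }

  // Escape hatch for developer-written markup; the name is easy to grep for.
  static unsafeTrusted(markup: string): Escaped { return new Escaped(markup); }

  render(): string { return this.markup; }
}

// Splicing only accepts Escaped, so forgetting to escape is a type error.
function paragraph(content: Escaped): Escaped {
  return Escaped.unsafeTrusted("<p>" + content.render() + "</p>");
}

const comment = new Unescaped('<script>alert("hi")</script>');
const page = paragraph(Escaped.escape(comment)).render();

// paragraph(comment);
// ^ rejected by the compiler: Unescaped is not assignable to Escaped
```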

@raichoo well it doesn't help to prevent the problems by its mere existence. You still have to know how to use the language features, which seem to make such things a lot easier and transparent -

From my experience, most SQLI etc. problems are there because the code was written by developers who were not aware of the OWASP Top 10 or didn't give a shit. They will deliver crap with every language they use.

@esopriester Those developers should not be in charge of writing the libraries that prevent this. It's also an organizational problem. Anyway, most of them are using web frameworks and are not directly writing library code. That's how you usually tackle it: present them with an API that causes compile-time errors when they are doing stuff like this. They are merely consumers of libraries.

@esopriester Libraries should impose certain invariants via their API. Most libraries I've seen are passing around Strings as if there were no tomorrow. You can't impose anything using that approach. You also can't prevent sloppy users from working around these restrictions, but the code that results will be very obvious to spot in a code review (heck, even `grep(1)` could do that).

@esopriester It's not a silver bullet, but it still prevents a whole lot of mistakes from being made. For some reason people dismiss this approach, though, because it does not fix 100%, which I consider impossible because you will always find developers who blowtorch their way through any safety restrictions. I still believe that most developers are not keen on shooting themselves in the foot if they can help it. Supporting those should be the priority.

@esopriester @raichoo Features by themselves seldom do anything, esp. when they're poorly understood or even ignored. I wouldn't worry about those who can't be bothered to do the bare minimum. Sucks if you are stuck in a job with people like that, I know, but in my experience there is nothing that will bring them around. I'm more concerned about finding the right tools, methods and workflows that let me write the best code I can, and learn from and teach those who have some curiosity left in them.

You treat user input as a possible serialisation of the values that your program works with, and you start by parsing that input into either those values or an error. Drawing a clear line between internal and external representation would eliminate about 70% of the OWASP Top 10.
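A rough TypeScript sketch of "parse into values or an error", under assumptions: the `Signup` type, its fields and the validation rules are invented for illustration. The external representation is a JSON string; the internal one can only be built by the parser.

```typescript
// Hypothetical types and rules, for illustration only.
type Parsed<T> =
  | { kind: "ok"; value: T }
  | { kind: "error"; message: string };

// Internal representation: can only be constructed through Signup.parse().
class Signup {
  private constructor(
    private readonly user: string,
    private readonly years: number,
  ) {}

  username(): string { return this.user; }
  age(): number { return this.years; }

  // External representation (a JSON string) in, internal value or error out.
  static parse(body: string): Parsed<Signup> {
    let data: any;
    try {
      data = JSON.parse(body);
    } catch (e) {
      return { kind: "error", message: "not JSON" };
    }
    if (typeof data !== "object" || data === null) {
      return { kind: "error", message: "expected an object" };
    }
    const user = data.username;
    const years = data.age;
    if (typeof user !== "string" || !/^[a-z][a-z0-9_]{2,15}$/.test(user)) {
      return { kind: "error", message: "invalid username" };
    }
    if (typeof years !== "number" || Math.floor(years) !== years || years < 0 || years > 150) {
      return { kind: "error", message: "invalid age" };
    }
    return { kind: "ok", value: new Signup(user, years) };
  }
}

// Code past the boundary works with Signup only and never sees the raw string.
function welcome(s: Signup): string {
  return "Welcome, " + s.username() + "!";
}

const accepted = Signup.parse('{"username":"alice_01","age":30}');
const rejected = Signup.parse('{"username":"<script>","age":30}');
```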

@raichoo It doesn't necessarily have to be the type system. This blog post identifies the problem as trying to cram an AST into a string, thereby disregarding the context that would allow you to tell how user input has to be handled:

@raichoo if a code generator is done right, you also never have to see SQL or HTML again, which is a very easy win

@nodefunallowed I would disagree with the "not seeing SQL again is a win". To me the biggest issue with code generators, especially for SQL, is that they mostly only generate SQL98 level code. Databases have evolved quite a lot since then, with pretty much every generator I had to use I had to fall back to handwriting SQL at some point since the generator simply could not generate the needed code. It works pretty well for simple CRUD though.

@raichoo I see. Haven't had to do any complicated or implementation-dependent SQL stuff before.

@nodefunallowed You can get surprisingly far with generators. But once the queries are getting more complicated they tend to churn out stuff that's getting harder and harder to debug and optimize. Especially if you need to find that one part in your code that is tanking the whole execution plan and then try to convince the generator to produce the SQL you already know you need (if that's even possible).
