Use cURL from your Pwnbox (not the target machine) to obtain the source code of the "https://www.inlanefreight.com" website and filter all unique paths of that domain. Submit the number of these paths as the answer.
The right command is this:
curl https://www.inlanefreight.com > htb.txt && cat htb.txt | tr " " "\n" | cut -d"'" -f2 | cut -d'"' -f2 | grep "www.inlanefreight.com" | sort -u | wc -l 2>/dev/null
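First, curl fetches the page and redirects the output to htb.txt; then (&&) cat reads htb.txt, and the rest is filtering: tr puts every space-separated token on its own line, the two cut commands pull out whatever sits between single or double quotes, grep keeps the lines containing the domain, sort -u removes duplicates, and wc -l counts what is left.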
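To see what each stage of that filter does without touching the network, here is a minimal sketch that runs the same pipeline over a few made-up lines of HTML. The sample() helper and the links inside it are assumptions for illustration, not the real page source.

```shell
# Hypothetical stand-in for the downloaded page (not the real site content).
sample() {
cat <<'EOF'
<a href="https://www.inlanefreight.com/about">About</a>
<link rel='stylesheet' href='https://www.inlanefreight.com/style.css'>
<a class="nav" href="https://www.inlanefreight.com/about">About again</a>
<a href="https://example.org/">Elsewhere</a>
EOF
}

# Same filter chain as in the post above:
#   tr      - one space-separated token per line
#   cut x2  - keep the text between single quotes, then between double quotes
#   grep    - keep only tokens mentioning the domain
#   sort -u - drop duplicates
#   wc -l   - count the remaining lines
count=$(sample | tr " " "\n" | cut -d"'" -f2 | cut -d'"' -f2 \
  | grep "www.inlanefreight.com" | sort -u | wc -l)

echo "unique links: $count"
```

The /about link appears twice in the sample but is counted once, and the example.org link is dropped by grep.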
If you don’t know what a path in a domain is: for example, in the URL https://cloudflare.com/learning/, cloudflare.com is the domain name, https is the protocol, and /learning/ is the path to a specific page on the website.
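The same split can be done in the shell itself. This is only a sketch using plain parameter expansion; the variable names are made up, and it assumes the URL has a scheme and at least one "/" after the domain.

```shell
url="https://cloudflare.com/learning/"

protocol="${url%%://*}"   # drop everything from the first "://" onward
rest="${url#*://}"        # drop the scheme: cloudflare.com/learning/
domain="${rest%%/*}"      # keep everything before the first "/"
path="/${rest#*/}"        # keep everything after it, restoring the leading "/"

echo "protocol: $protocol"
echo "domain:   $domain"
echo "path:     $path"
```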
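let x = 0 // current guess; the original snippet never defined x, so starting at 0 is an assumption
function guesser(){
document.getElementById("answer430").value = x++ // fill in the guess, then move on to the next number
document.getElementById("btnAnswer430").click()
}
setInterval(guesser,1000)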
This shorter version worked for me:
curl https://www.inlanefreight.com | tr " " "\n" | cut -d"'" -f2 | cut -d'"' -f2 | grep www.inlanefreight.com | sort -u | wc -l
Thank you for the hint anyway. I was overwhelmed at first, but now it’s totally comprehensible.
If you are used to using regex, then you don’t need to use so many commands - grep can handle all the filtering for you:
curl https://www.inlanefreight.com/ | grep -Po "https://www.inlanefreight.com/[^'\"]*" | sort -u | wc -l
Could you clarify what you did in that grep filter? I am new to this and would like to understand what [^'"]* does.
Thanks!
Sure, but bear in mind that it hasn’t been covered in the material; I was just saying that if you already know regex, you can use that instead.
[^'"]* means any character ([ ] lists characters) that isn’t (^ = isn’t) a ' or a ". The * at the end means match 0 or more of them.
So basically, that expression searches for https://www.inlanefreight.com/ and includes all characters after it until it reaches a ' or a " (which would be the closing of the anchor link).
We need to check for both ' and ", as some of the links are written:
<a href='https://www.inlanefreight.com/'>
and others are written
<a href="https://www.inlanefreight.com/">
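A quick way to check this is to feed both quoting styles to the pattern. The sketch below uses two made-up anchor tags (the /news link is hypothetical) and grep -Eo instead of -Po, since the pattern needs no Perl-specific features; the dots are escaped so they only match a literal dot.

```shell
# Two sample anchors, one per quoting style (illustrative input only).
matches=$(printf '%s\n' \
  "<a href='https://www.inlanefreight.com/'>Home</a>" \
  '<a href="https://www.inlanefreight.com/news">News</a>' \
  | grep -Eo "https://www\.inlanefreight\.com/[^'\"]*")

# Each match stops right before the closing ' or " of its link.
echo "$matches"
```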
It should be more like
grep -Po "https?://www.inlanefreight.com[^\"'?#]*"
shouldn’t it?
The URL path ends at the first ? (query parameters) or # (fragment). Thus the correct answer should be one less.
(https://www.inlanefreight.com/index.php/wp-json/oembed/1.0/embed is present in the result with different query parameters.)
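The difference between the two patterns can be reproduced on a small sample. This is a sketch only: the query strings and the #top fragment below are invented for illustration (the thread does not show the real ones), and grep -Eo stands in for -Po.

```shell
# Hypothetical links: the same path twice with different query strings,
# plus one link carrying a fragment.
sample() {
cat <<'EOF'
<a href="https://www.inlanefreight.com/index.php/wp-json/oembed/1.0/embed?url=a">
<a href="https://www.inlanefreight.com/index.php/wp-json/oembed/1.0/embed?url=a&format=xml">
<a href='https://www.inlanefreight.com/index.php/news#top'>
EOF
}

# Stops only at a quote, so query strings and fragments stay attached.
loose=$(sample | grep -Eo "https?://www\.inlanefreight\.com[^\"']*" | sort -u | wc -l)

# Also stops at ? and #, so only the path itself is compared.
strict=$(sample | grep -Eo "https?://www\.inlanefreight\.com[^\"'?#]*" | sort -u | wc -l)

echo "quote-only pattern: $loose"
echo "path-only pattern:  $strict"
```

On this sample the quote-only pattern counts 3 entries and the path-only pattern counts 2, because the two embed links collapse into one path.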