Batch download images from Xiaohongshu
- Open Firefox developer tools.
- Search for the HTML class name `change-pic` and delete it.
- Press `C-S-c` to re-pick the HTML element containing the image area.
- Select the parent HTML node of the currently focused node, which is something like `<ul class="slide" data-v-f3680c6c="">`.
- Press `C-c` to copy the HTML code of the node and save it to a file.
- Use a combination of several commands, `grep`, `sed`, `xargs` and `wget`, to extract the image URLs and download all of them.

  `grep -P -o '(?<=url\().*?(?=\))' album.html | sed "s/&quot;/\"/g" | sed "s/\/\//https:\/\//" | xargs wget`

N.B. In the pipe-concatenated commands above, `grep` uses the option `-P` to enable Perl-compatible regular expressions. Its option `-o` outputs only the matched part of each line. In the regular expression `(?<=url\().*?(?=\))`, `(?<=url\()` is a positive look-behind assertion, while `(?=\))` is a positive look-ahead assertion. `.*?` matches any characters between a pair of parentheses `()` in non-greedy mode. The overall effect of this regular expression is to extract the URL from a string like the one below.

  `url(&quot;//ci.xiaohongshu.com/8f9e2cf5-ea58-ef7f-b3bb-65e8ede2798c?imageView2/2/w/1080/format/jpg&quot;)`

`sed` first replaces the HTML quote entity `&quot;` with `"` and then `//` with `https://`. Finally, `xargs` passes the extracted URLs to `wget`, which downloads the images.
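
Before downloading anything, it can help to check what the extraction stage produces. The following is a minimal sketch of such a dry run, not the exact command above: `extract_urls` is a hypothetical helper name, the sample file is made up around the URL quoted in the note, and the quote entities are dropped rather than converted so that the printed URLs are bare. It assumes Bash and a GNU `grep` built with `-P` support.

```shell
#!/usr/bin/env bash
# Dry run of the extraction stage: print the URLs instead of downloading them.
# extract_urls is a hypothetical helper, not part of any tool.
# Stage 1: grep keeps only the text between "url(" and the next ")".
# Stage 2: sed drops the HTML-escaped quotes (&quot;).
# Stage 3: sed turns a leading "//" into "https://".
extract_urls() {
    grep -P -o '(?<=url\().*?(?=\))' "$1" \
        | sed 's/&quot;//g' \
        | sed 's|^//|https://|'
}

# A made-up sample that mimics the saved node; only the URL comes from the note.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ul class="slide"><li style="background-image: url(&quot;//ci.xiaohongshu.com/8f9e2cf5-ea58-ef7f-b3bb-65e8ede2798c?imageView2/2/w/1080/format/jpg&quot;);"></li></ul>
EOF

extract_urls "$sample"
rm -f "$sample"
```

Once the printed list looks right, the same function can feed the download step, for example `extract_urls album.html | xargs wget`.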