Anonymous
Dear /w/:
I have over 18,000 wallpapers.

What should I do now? This insane amount of /w/ class wallpapers should be good for something.
>> Anonymous
You, sir, are what we like to call a loser. Your life will never amount to anything because you have spent it on 4chan collecting wallpapers. You have so many walls that you could change them out every day of your life and still not get through them all. Way to go.

jk, you should rar and megaupload them
>> Anonymous
Time to ascend.
>> Anonymous
>>559981
Nah. I made a script to rip all images from a board. I run it about once a week or so to maximize content freshness and then filter through deleting the crap, duplicates, etc.

Was toying with the possibility of having a webserver set up to search, add tags to images, etc for this crap. I'm kinda lazy though and suck at teh web programming.
>> Anonymous
OVER 9000 dollars says 99% of them are tiny-ass postcard sized postage stamp resolutions of 1280x1024 or smaller.
>> Anonymous
anyone know of an application that detects similar/same images?
>> Anonymous
>>559986
I still think you should rar and megaupload them
>>559993
I do not think that such a program exists
>> Anonymous
>>559991
Well, post better quality images then...

Anyways. 1/3 of them are 1024x768. Another 1/3 are 1280x800-1280x1024. The last 1/3 are all above 1280x1024.

>>559993
I use iView Media Pro (which is now Microsoft Expression Media). It allows you to search by exact size (finding exact duplicates), color makeup (if you're going for a particular desktop color scheme), and with tolerances (good for finding resized duplicates), along with a variety of other search options.

Good luck finding a download for it though.
>> Anonymous
18000? that's like... 2 9000s.
>> Anonymous
>>559996
Think I might. I'll do it in RS though since I have a premium account. Lol.
>> Anonymous
>>560002
Nope, more like 18500~. It's OVER 2 9000s!
>> Anonymous
>>559993
>>559986

Gentlemen, your solution is pImgDB. It automatically skips 100% matching images from being imported, also it can find detexts etc afterwards. It happens to have a webserver built-in, although that feature isn't very powerful.

What's amazing with it is its tagging features. Take one-click tagging for instance. You assign tags to keys on your keyboard, and simply hit the correct key at each image. If there are large groups of images that should be tagged the same, just select them before you hit the tagging key.

Before you dump the images somewhere, remember to write emtags to the images. If someone downloads and imports an image later on, it will automatically be loaded with the tags you entered. Embedded tags, in other words. It's awesome.

It's got way more features, like built-in 4chan uploader, but there's not enough room to list them all here. Look at the changelog for in-depth info.

http://praetox.com/site/software/pimgdb.html
>> Anonymous
>>560005
I don't think he's gonna tag all the images.
>> Anonymous
>>560005
why wasn't I told about this before?
>> Anonymous
>>560009
You're probably right, but it'd still ease the sorting process. He could use the oneclick tagging thing to tag "keep" or "delete", then use "Mass edit" to delete all the shitty images.
Besides, if he put the images in a proper folder structure, it can automatically tag the images based on their locations. That way everything would be a quick ctrl-f away.
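For anyone wondering what location-based tagging boils down to, here's a minimal sketch. It is not pImgDB's code (the thread doesn't show how it works internally); `tags_from_path` and the folder names are made up for illustration, and it simply assumes every folder between the collection root and the file becomes one lowercase tag.

```python
from pathlib import PurePosixPath


def tags_from_path(image_path, root):
    """Turn each folder between `root` and the file into a lowercase tag.

    Hypothetical helper: assumes the folder names are the tags you want.
    """
    relative = PurePosixPath(image_path).relative_to(PurePosixPath(root))
    # Drop the last part (the file name); everything before it is a folder.
    return [part.lower() for part in relative.parts[:-1]]


if __name__ == "__main__":
    print(tags_from_path("/walls/Anime/Landscape/img001.jpg", "/walls"))
```

An image sitting directly in the root folder gets no tags at all, so a flat dump of 18,000 files gains nothing from this until it's sorted into folders.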
>> Anonymous
Shit's still relatively new. Actually, it was us that gave the dev the idea back in the end of april. First version was around july or so, but it didn't turn completely epic before august.
Spread the word! As more people use it, more images on 4chan will already be tagged as we download them. Automatic image organizing is upon us.
>> Anonymous
>>560005
aw lawdy , dat some Lake Lucerne
>> Anonymous
>>559986
kekeke, I already implemented such a system (albeit without tagging; tags are for fags!). Been running it since early May (93,000 images, 220,000 posts).

https://suigintou.desudesudesu.org/4scrape

Also, I didn't know pImgDB had a centralized system for globally associating (what I assume to be) an image's MD5 with a set of tags. That's a pretty cool idea (or does it just embed the tagging information within the image?). Anyway, sage for self-promotion.
>> Anonymous
>>560018
Yours is decent. Needs to be redone though. The entire interface and feature set is balls.
>> Anonymous
>>560005
>100% matching
...uh, md5 them and compare, perhaps?
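A rough sketch of the "md5 them and compare" approach, using only the Python standard library. The function names are made up for illustration, and this only catches byte-for-byte identical files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(folder):
    """Map each MD5 to the files sharing it, keeping only groups of 2 or more."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            groups[file_md5(path)].append(path)
    return {md5: paths for md5, paths in groups.items() if len(paths) > 1}
```

Calling `find_exact_duplicates("walls")` on a folder of that name (an assumed path) returns a dict of hash to file list; everything but the first file in each list is safe to delete.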
>> Anonymous
>>560020
If you have specific feature/change requests (rather than "it sucks donkey dicks", which doesn't really tell me what issues you have with it), make them and I'll consider implementing them.

Please leave any such advice on the 4scrape comments page so we can stop cluttering up the OP's thread with my stupid shit :3

>>560022
MD5 is actually a terrible way to compare two images, simply because extremely minor changes (re-saving a JPEG, changing some EXIF data, different resolutions/aspect ratios of the same image, etc) result in a different hash value. What most image organizers do (pImgDB included, I think) is decompose the image such that minor changes (and even things like detexts) are much less noticeable so two versions of the same image are flagged as similar.

The algorithm 4scrape uses is to divide the image up into 4 quadrants, then find the average pixel value of each quadrant (which produces a vector of 4 pixels * 3 color channels = 12 numbers). To compare two images for similarity, each pair of corresponding dimensions is compared; if the difference in any single dimension exceeds a threshold, it's considered not a match; if the sum of the absolute values of all the differences exceeds a second threshold, the image is again discarded.
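Roughly, in Python, that comes out like the sketch below. This is not the actual 4scrape code: the image is assumed to be already decoded into rows of (r, g, b) tuples, and both threshold values (`per_dim_limit`, `total_limit`) are made-up numbers you'd have to tune.

```python
def quadrant_signature(pixels):
    """Average RGB of each of the 4 quadrants, as a flat list of 12 numbers.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Assumes the image is at least 2x2 so no quadrant is empty.
    """
    height, width = len(pixels), len(pixels[0])
    mid_y, mid_x = height // 2, width // 2
    boxes = [
        (0, mid_y, 0, mid_x),           # top-left
        (0, mid_y, mid_x, width),       # top-right
        (mid_y, height, 0, mid_x),      # bottom-left
        (mid_y, height, mid_x, width),  # bottom-right
    ]
    signature = []
    for y0, y1, x0, x1 in boxes:
        count = (y1 - y0) * (x1 - x0)
        for channel in range(3):
            total = sum(pixels[y][x][channel]
                        for y in range(y0, y1) for x in range(x0, x1))
            signature.append(total / count)
    return signature


def is_similar(sig_a, sig_b, per_dim_limit=16, total_limit=96):
    """Two-stage check: no single dimension too far off, and total drift small."""
    diffs = [abs(a - b) for a, b in zip(sig_a, sig_b)]
    if max(diffs) > per_dim_limit:
        return False                    # one quadrant/channel differs too much
    return sum(diffs) <= total_limit    # overall difference must also be small
```

The signature is the expensive part, so it gets computed once per image and cached; comparing two cached signatures is just 12 subtractions.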

There's all kinds of academic papers on image comparison (it's basically machine vision, easy mode) and there's all kinds of crazy ways to implement it. I chose a method which was shit, but easy to implement and relatively easy to compare large amounts of images (because it can cache the expensive operation - decomposition of large images).

But yeah. You've got a really nice set of data, OP, and there's all kinds of cool things you can do with it (with regards to image analysis). Even if you're not a technical person, you can use it to do artsy shit like image mosaics, or raid a computer lab and set each monitor with a random wallpaper or something. Use your imagination :3
>> Anonymous
>>560027
I'm looking to improve my script.

Question. Are you scraping first, or downloading the image first and then checking if it's a duplicate? Or are you downloading, checking the binary data, comparing, and only writing if it's not a duplicate?
>> Anonymous
>>559993
Visipics
>> Anonymous
fuck rapidshare make a torrent
>> Anonymous
>>560005
Your program is great, but I have a question...

How do I add new images to the database? Say if I were saving an image from /w/, how would I add that to the database and quickly tag it?
>> Anonymous
>>560072
Also, to clarify the question a bit:

If I have a pre-existing database, which is say, /w/. If I want to save a new image and add it to /w/, how do I do this?
>> Anonymous
>>560035
First, it downloads all pages of threads and parses each page to find a list of thread IDs and the most recent post in each thread.

For each thread, it checks the current most recent post against the most recent post it's seen -- if there are new posts in the thread, it downloads the thread page, otherwise it moves on.

For each thread page it fetches, it parses all of the posts. For each post, it checks to see if the post is already in the database; if it is, the post is discarded, otherwise, the post image is downloaded.

For each image downloaded, it computes the MD5 of the data to see if there's a hash match. If so, it discards the downloaded image and uses a reference to the old one instead. If a new image is added, it then crunches the image to determine random statistical stuff (finding similar images, average color; a friend of mine wrote a script to find the N most significant colors in an image that I've been meaning to integrate, etc).

The idea though, is to download as little as necessary :3
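The flow above, boiled down to a sketch with the networking and HTML parsing left out. All names and data shapes here are assumptions for illustration (plain dicts and sets stand in for the database), not the real scraper.

```python
import hashlib


def threads_to_fetch(board_index, last_seen):
    """Return IDs of threads whose newest post is newer than what we saw.

    board_index: {thread_id: newest_post_id} parsed from the index pages.
    last_seen:   {thread_id: newest_post_id} remembered from the last run.
    """
    return [tid for tid, newest in board_index.items()
            if newest > last_seen.get(tid, 0)]


def new_posts(thread_posts, known_post_ids):
    """Keep only the posts that are not in the database yet."""
    return [post for post in thread_posts if post["id"] not in known_post_ids]


def store_image(data, images_by_md5):
    """Store image bytes once per MD5; return (md5, was_new)."""
    md5 = hashlib.md5(data).hexdigest()
    if md5 in images_by_md5:
        return md5, False        # duplicate: point the post at the old copy
    images_by_md5[md5] = data    # new image: statistics would be computed here
    return md5, True
```

Note that the MD5 check can only happen after the bytes are downloaded, so the savings come from the first two steps: skipping unchanged threads and already-seen posts.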
>> Anonymous
you should number them all in order of preference
>> Anonymous
>>559976
>I have over 18,000 wallpapers

Buy 18,000 monitors and build a house out of them that displays your wallpapers on the outside of the house.

Oh, and don't forget the security system or the niggers will steal your monitors.
>> Anonymous
u should share some or rapidshare
>> Anonymous
>>560084

That would be fucking AWESOME!

But it would probably piss off your neighbors as well.

But still fucking AWESOME!
>> Anonymous
>>560077
Ah, that's pretty much what I do. I feel it's such a waste though to download and find out it's a duplicate of something past.
>> Anonymous
>>560022
The initial import only checks the image md5 (not the file md5, mind you). This means that it would still import resaves as .jpg and resizes, but it will stop files with changed metadata (tags) and various .png versions.
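To show the difference between a file md5 and an image md5, here's a toy sketch. It is not how pImgDB does it; it uses binary PPM files only because the standard library can pick them apart without an image decoder, with a header comment standing in for embedded tags. The function names are made up.

```python
import hashlib


def ppm_raster(data):
    """Return the pixel bytes of a binary PPM (P6), skipping header comments."""
    fields, pos = [], 0
    while len(fields) < 4:              # magic, width, height, maxval
        if pos >= len(data):
            raise ValueError("truncated PPM header")
        byte = data[pos:pos + 1]
        if byte.isspace():
            pos += 1
        elif byte == b"#":              # a comment runs to the end of the line
            newline = data.find(b"\n", pos)
            if newline == -1:
                raise ValueError("unterminated comment in PPM header")
            pos = newline + 1
        else:
            end = pos
            while end < len(data) and not data[end:end + 1].isspace():
                end += 1
            fields.append(data[pos:end])
            pos = end
    if fields[0] != b"P6":
        raise ValueError("not a binary PPM")
    return data[pos + 1:]               # one whitespace byte ends the header


def image_md5(data):
    """Hash only the pixel data, so metadata changes don't alter the result.

    A fuller version would also hash width and height.
    """
    return hashlib.md5(ppm_raster(data)).hexdigest()


pixels = bytes([255, 0, 0, 0, 0, 255])                      # 2x1 image
plain = b"P6\n2 1\n255\n" + pixels
tagged = b"P6\n# tags: sky, lake\n2 1\n255\n" + pixels      # same picture, new metadata
```

Here `plain` and `tagged` have different file MD5s but the same `image_md5`, which is the case the import check is meant to catch. A re-saved JPEG changes the pixels themselves, so it still gets through.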

>>560027
pImgDB basically does the same thing when it comes to actual image comparison, except with 8x8 pixels. There's no tag database as of yet, but I've thought about it - right now it just embeds the tags into the image.
>Sage for self-promotion
Not something I'd usually do, but this seemed like a golden opportunity. Cheers!

>>560072
While an "import from website" feature is kind of implemented, I'd recommend you to do the importing in bigger batches. To import one or more images, just drag/drop them into the app. Remember to use the {fname, {1} {2} etc. parameters if they're already sorted into folders.

Oh, and I stole your cooliris idea. Hope you don't mind.
>> Anonymous
>>560216
Ideas are made to be shared; it wasn't mine in the first place :3

Cooliris is fucking awesome <3
>> Anonymous
>>560264
True that, but isn't it somewhat of a bandwidth raep for you? =P
>> Anonymous
>>560320
Not like there's anything I can do about it.
>> Anonymous
>>560344
Well kinda, as you could just... you know, remove it again. I guess that wouldn't be very well accepted though, and it sounds like you've got quite the connection at any rate. Did you move the server yet btw?
>> Anonymous
>>560005
Why does avast claim this has a virus?
>> Anonymous
>>560448
There's a really lengthy explanation about that in the user's guide, but in short it's because antivirus companies are lazy. They do a sloppy job at sampling the virus, which makes some perfectly good homebrew set off alerts. Heck, even mainstream apps like Flash Player were flagged as virii once. I don't feel like arguing the case much longer though, as some people are hellbent on blindly trusting their antivirus software. What you do is up to you, really.

Still, I just don't see why I would infect my biggest project to date with a trojan. Way too much work just "for the lulz".
>> Kou !!Bzfs+5yATPi
>>560448
Because Avast! is a whiny bitch.

Anyways, there is a section in the manual about virus claims.

http://praetox.com/site/software/pimgdb-manual-01.html
>> Anonymous
>>560461
Thanks; I suck at writing good/short summaries.
>> Anonymous
>>559976
A torrent should be made of your wallpapers.
>> Anonymous
IMAGE DUMP!!!
>> Anonymous
If you tag all the images before you dump them, I'll love you forever. Seriously.
>> Anonymous
>>560656
Do you really think that's going to happen? Enjoy your crushed dreams and manual labor by doing it yourself.
>> Anonymous
a torrent would be a great idea.
>> Anonymous
TIME TO MEGAUPLOAD ^_^.

PLZ NO TORRENT.
>> Anonymous
Share them in any way. Torrent, RS, Megaupload, anything .. we would be forever grateful ^_^
>> Anonymous
post the script you are using...
>> Anonymous
you sir are a god
>> Anonymous
anyone know if OP /rs/ them yet?
>> Anonymous
>>560005
Holy crap, dude, you did an awesome looking job. One question, will you consider porting it to Linux? I'd rather keep my Windows partition Steam-only.
>> Anonymous
>>561315
Now that I've got a linux-only faptop, I'm tempted to do just that. I'll have a go at mono some day, but don't get your knickers in a twist just yet... I don't know whether 1GB of HDD space will be enough to play about with. inb4 get an external hdd - I'm on a very tight budget atm.

First I've got to get the project back into a working state, however. Visual Studio doesn't like the amount of buttons and shit in one window, so it crashes every time I try to load the project. I'll grab the opportunity to finally write a decent skinning engine for it... A fringe benefit will be an awesome level of customizationability (is that even a word?).

Nothing will probably happen in the nearest future though, as I'm painfully busy with rl stuff. Good thing the latest beta isn't too buggy.
>> Anonymous
>>561613
Awesome, hopefully you can get it working. Else I'll just have to use it under Windows. Good luck, and I think the word you seek is 'customizability'.
>> Anonymous
gah, 4chan ate my post
>> Anonymous
nononono dont get distracted by linux! we need the sorting by tags and size and inclusive/excluding through tags and improving the import and I'm sure a bunch of other stuff before you go worrying about porting it!

^_^
>> Anonymous
>>559986
any way I could get some info on that script of yours?
>> Anonymous
>>562354
Right-o. I've been working on the skinning engine for the last couple of hours, and it shows - it's closing up to being finished. None of the created buttons actually do something, but that last part shouldn't be too hard to squeeze in.

As for the rest of your requests, they shouldn't be too hard to implement either - my only problem is my irregular schedule, so it's hard to get anything done at all. Good things come to those who wait, aye?

Pic related, it's finally starting to take shape (again).
>> Anonymous
>>562454
lol mnaual
>> Anonymous
>>562937
Well yeah, I was tired AND in a rush. It's fixed now... Or maybe I should keep it like that? Anyways, time to go back to that blasted economics.
>> Anonymous
so where the hell is this RAR/Torrent/ANYTHING?
>> Anonymous
>>559976
delete and start over.
>> Anonymous
What the shit happened here?
Neways, thanks for a great app >>560005.
>> Anonymous
>>560005
>Trojan.Win32.Delf.fcd

I see what you tried here
>> Anonymous
>>563420
see >>560461
>> Anonymous
>>563107
>> SimilarImages Anonymous
>>559993

http://celebnamer.celebworld.ws/similarimages/

Have used, does work.
>> Anonymous
I like Visipics. It helped me sort through gigs of pictures, and you can set how strict it is when deciding whether two images are similar.
>> Anonymous
>>563420
false positives lol
>> Anonymous
waiting for rar
>> Anonymous
op is a tease
>> Anonymous
>>559976
>>559986

can you provide a link to the script you use?
that would be very helpful and awesome...=D
>> Anonymous
Upload them on Rapidshare so I can download em! :D<333
>> Anonymous
bump
>> Anonymous
cmon
>> Anonymous
i think he forgot about us
>> isac
i think i have only over 2000 pics
>> Anonymous
>>565378
goback2/g/ isac
>> Anonymous
;_;