+87
Planned

Option to remove duplicates from the tree

Kevin P 9 years ago, updated by Adela Glez. 8 months ago 30

I have over 3000 tabs in my tree, with lots of duplicates. It would be great to have an option to remove duplicates. I'm currently using Tab Dupectomy, but reopening a large number of tabs causes Chrome or TO to crash.

+2
Or at least allow selection of more than one tab at once?
+10
I'd love to see a button that searches for duplicate tabs (or, what happens even more often for me, duplicate nodes!) and deletes the oldest tabs/nodes first. I really like this extension, but I usually don't remember to clean it up on every Chrome run, and after some time it piles up, which makes me waste A LOT of time cleaning it up.
+2
With Tabs Outliner, hierarchy is important. I think it would be best if the extension didn't indiscriminately remove duplicates. Better would be to highlight them in the tree and allow the user to remove the preferred ones. 
+6
That's nice when you have a few hundred tabs, not if you have a few thousand. But it could work in two ways: highlighting the duplicates, or simply "removing oldest/newest".
+2
Please add the feature of removing duplicates; the add-on has now become more than useless because it lacks this feature.
+2
Vladyslav, if you could describe the data structure you use to me, I would be happy to implement this, and you would own the (admittedly small) contribution. The precise, conservative behavior I want is this:

1. Click a new button, or perhaps a context menu item. I can show you how to easily enable a custom context menu using a lightweight jQuery widget, if you're interested.
2. TO perhaps first warns how many tabs will be deleted, with a simple OK/Cancel popup.
3. For each distinct URL x that appears in the current session tree: delete every occurrence of x in a "crashed" or unnamed window before the most-recent occurrence. If a duplicate appears in a named window, leave it alone.
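
For illustration, the conservative rule described above could be sketched roughly like this. It is not based on Tabs Outliner's real data structure (which is unknown here): the flat `tabs` array, the `windowKind` field, and the function name `planConservativeDedupe` are all hypothetical, and "most recent" is assumed to mean "last in tree order". The function only returns the indexes that would be deleted, so the count can be shown in a confirmation popup first.

```javascript
// Hypothetical flat model of the tree, listed in tree order (assumed oldest first).
// `windowKind` is an assumed field: "named", "unnamed", or "crashed".
function planConservativeDedupe(tabs) {
  // Remember the index of the last (assumed most recent) occurrence of each URL.
  const lastIndexByUrl = new Map();
  tabs.forEach((tab, i) => lastIndexByUrl.set(tab.url, i));

  const toDelete = [];
  tabs.forEach((tab, i) => {
    const isMostRecent = lastIndexByUrl.get(tab.url) === i;
    const isProtected = tab.windowKind === "named"; // named windows are left alone
    if (!isMostRecent && !isProtected) toDelete.push(i);
  });
  return toDelete;
}

const exampleTabs = [
  { url: "https://a.example", windowKind: "crashed" },
  { url: "https://b.example", windowKind: "named" },
  { url: "https://a.example", windowKind: "unnamed" },
  { url: "https://b.example", windowKind: "unnamed" },
  { url: "https://a.example", windowKind: "unnamed" },
];
const plan = planConservativeDedupe(exampleTabs);
console.log(`${plan.length} tabs would be deleted:`, plan);
```

Returning a plan instead of deleting right away keeps the "warn first, then delete" step from the list above cheap to implement.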
+3
Well, just check the DOM structure. You can run any automation on the tree from the Chrome developer console.

I plan a new release next week. It will not fix this problem yet, but a release shortly after it will. The real solution is not removing duplicates, but not having them in the first place.
+1

When will you release the duplicate removal feature?

+1
You can also find the tree data in plain text (JSON format) in the File System folder of the Chrome profile. Again, a future version will allow importing such files back. It can be done even now, but the procedure is somewhat complex...

Another solution, and maybe a better one: export the tree as HTML with Ctrl-S, deduplicate the file programmatically, and import it back into a fresh instance with an empty tree by dragging and dropping the hierarchies (the next version will allow importing the whole tree by dragging the root node; the current one has a bug that prevents this).
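
As a rough idea of what "programmatically deduplicate" could mean once the exported data has been parsed into a nested structure, here is a minimal sketch. The node shape (`title`, `url`, `children`) and the name `dedupeTree` are assumptions made for the example; they are not the real export format, which would need its own parsing step. The sketch keeps the last occurrence of each URL and never drops a node that still has children.

```javascript
// Hypothetical node shape (NOT the real export format):
//   { title: string, url?: string, children: Node[] }
function dedupeTree(root) {
  const seenUrls = new Set();
  let removed = 0;

  function prune(node) {
    const kept = [];
    // Walk children last-to-first so the last occurrence of a URL is met first.
    for (let i = node.children.length - 1; i >= 0; i--) {
      const child = node.children[i];
      prune(child); // handle the child's own subtree before judging the child
      const isDuplicate = child.url !== undefined && seenUrls.has(child.url);
      if (isDuplicate && child.children.length === 0) {
        removed++; // childless duplicate: drop it
        continue;
      }
      if (child.url !== undefined) seenUrls.add(child.url);
      kept.unshift(child); // unshift restores the original order
    }
    node.children = kept;
  }

  prune(root);
  return removed;
}

const tree = {
  title: "Current Session",
  children: [
    { title: "Window 1", children: [
      { title: "A", url: "https://a.example", children: [] },
      { title: "B", url: "https://b.example", children: [] },
    ] },
    { title: "Window 2", children: [
      { title: "A again", url: "https://a.example", children: [] },
      { title: "C", url: "https://c.example", children: [] },
    ] },
  ],
};
console.log(dedupeTree(tree), "duplicate removed");
```

Note that a duplicate whose children were all removed as duplicates becomes childless and is then dropped too; whether that is desirable depends on how much the hierarchy matters to you.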

Hi


How exactly can I find this file in Chrome on my Mac? ("You can also find the tree data in plain text (JSON format) in the File System folder of the Chrome Profile")


Is there any way to export all trees and nodes into a text file?


Thank you

Thanks for the fast reply. I would've followed up sooner, but the email notification got marked "not important" by gmail.

If I add a copy of the DOM for your hover delete button in the appropriate place programmatically, and make a click event on the button, will the corresponding node in persistent storage be deleted? I realize I could test this in the console, but without ids on the nodes I don't see how to do so easily.

I know this topic is super old, but I created a super simple script to dedupe an exported tree backup. You can see it here: https://gist.github.com/jalaziz/03ecd04e44d3fc8bc393448c04e580ef

Hi Jameel

Would you please write a step by step for how to use your script?

Thank you !

+1

I still need to clean it up a bit to make it more usable. However, as of now you need to:

1. Export the current tree from Tabs Outliner (can be found on the Backup page in Settings).

2. Place the exported tree in the same directory as the script.

3. Change the name of the "export_file" variable in the script.

4. Run "python dedupe.py".

5. View the modified tree export file in Tabs Outliner (the option is right under the button that exports the tree).

6. Optionally, delete all the nodes from the current Tabs Outliner session.

7. Copy over the root node to the Tabs Outliner tree for the current session.

Hi

Have you "cleaned it up a bit", as you wrote?

I have never used Python and am not sure how to follow your instructions.

I'm on Chrome for Mac, and the duplicate nodes in many trees (thousands, from crashed sessions) make the extension too slow to use. It's a shame that I'll lose some of my saves, and the extension, as a result of all these duplicates.


Thanks!

+5

Instead of just removing all duplicate tabs from the tree... it would be more interesting, in my opinion, if the following happened instead:

1. I open a new tab that already exists somewhere in my TO tree,

2. TO detects this, and alerts me with how many duplicates there are, and also gives me the following options:

a. open this new tab as usual (the default)

b. open this new tab as usual, and remove all previous duplicates in the tree

c. instead of opening this new tab as usual, re-open the most recent instance in my tree

d. instead of opening this new tab as usual, re-open the most recent instance in my tree along with its original context (i.e. the whole window that it was in, with the other tabs)
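
The detection step behind this proposal (count the duplicates, find the most recent instance and the window it lives in) might look something like the sketch below. Everything about the data model is assumed for the example: the `type` field, the `savedAt` timestamp, and the function name `findDuplicates` are hypothetical and not taken from Tabs Outliner.

```javascript
// Hypothetical node shape: { type: "session" | "window" | "tab", title?, url?, savedAt?, children? }
// `savedAt` is an assumed timestamp used only to pick the most recent match.
function findDuplicates(sessionTree, url) {
  const matches = [];
  function walk(node, enclosingWindow) {
    if (node.type === "tab" && node.url === url) {
      matches.push({ tab: node, window: enclosingWindow });
    }
    const nextWindow = node.type === "window" ? node : enclosingWindow;
    for (const child of node.children || []) walk(child, nextWindow);
  }
  walk(sessionTree, null);

  // Pick the match with the largest (assumed) timestamp.
  let mostRecent = null;
  for (const match of matches) {
    if (mostRecent === null || match.tab.savedAt > mostRecent.tab.savedAt) mostRecent = match;
  }
  return { count: matches.length, mostRecent };
}

const session = {
  type: "session", title: "Current Session", children: [
    { type: "window", title: "Research", children: [
      { type: "tab", url: "https://a.example", savedAt: 100 },
      { type: "tab", url: "https://b.example", savedAt: 110 },
    ] },
    { type: "window", title: "Crashed", children: [
      { type: "tab", url: "https://a.example", savedAt: 200 },
    ] },
  ],
};
const found = findDuplicates(session, "https://a.example");
console.log(`${found.count} copies; most recent is in "${found.mostRecent.window.title}"`);
```

With `count` and `mostRecent` in hand, options a to d reduce to a choice of what to do next: nothing, remove the other matches, reopen `mostRecent.tab`, or reopen the whole `mostRecent.window`.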

+3

Here's a script to remove duplicates. There's also a list where you can put strings that, when found in the title of the tab, will cause it to be deleted, regardless of whether there are duplicates.


USAGE:

Bring TabsOutliner to the front.

control+shift+j (windows/linux) or command+shift+j (mac) to open the Developer Tools.

Click "Console"

Copy paste the code below into the line with the cursor. Press enter.


// If you DON'T do this, you could end up deleting whole subtrees when you only mean to delete the node
$('#expandAllButton').click();
// NOTE: this version removes duplicates even when they have child nodes
last_occurrences = new Map();
urls_list = [];
urls_set = new Set();
tonodes = $$('#savedtabundefined');
tonodes.forEach( (node) => {
    let url = $('.nodeTitleContainer',node).href;
    urls_list.push(url);
    urls_set.add(url);
});
console.log(tonodes.length + " tabsoutliner tabs");
console.log(urls_list.length + " url occurrences");
console.log(urls_set.size + " distinct urls");
if(tonodes.length == urls_list.length) {
    removed_dups = 0;
    kept = 0;
    // iterate from the end: keep the last occurrence of a url, and remove the rest
    for(let i=urls_list.length-1; i>=0; i--) {
        let url = urls_list[i];
        if( last_occurrences.has(url) ) {
            removed_dups++;
            tonodes[i].remove();
        }
        else {
            kept++;
            last_occurrences.set(url,i);
        }
    }
    console.log("Removed " + removed_dups + " duplicates.");
    console.log("Kept " + kept + " tabsoutliner tabs.");
}
else {
    console.error("Expected number of tabsoutliner nodes to be equal to number of url occurrences. This was false, so stopping to be safe.");
}
// undoes expand all
$('#expandAllButton').click();

Only removes duplicates based on exactly matching URLs. Keeps the most recent occurrence. 

+5

CORRECTED VERSION 

@Vladyslav are you able to remove the previous version? I can't.

function removeDups() {
    // If you DON'T do this, you could end up deleting whole subtrees when you only mean to delete the node
    $('#expandAllButton').click();
    const urls_list = [];
    const urls_set = new Set();
    const tonodes = $$('#savedtabundefined');
    tonodes.forEach( (node) => {
        let url = $('.nodeTitleContainer',node).href;
        urls_list.push(url);
        urls_set.add(url);
    });
    console.log(tonodes.length + " tabsoutliner tabs");
    console.log(urls_list.length + " url occurrences");
    console.log(urls_set.size + " distinct urls");
    if(tonodes.length == urls_list.length) {
        let removed_dups = 0;
        let kept = 0;
        let dups_with_children = 0;
        // iterate from the end: keep the last occurrence of a url, and remove the rest
        const last_occurrences = new Map();
        for(let i=urls_list.length-1; i>=0; i--) {
            let url = urls_list[i];
            if( last_occurrences.has(url) ) {
                // only remove duplicates that have no child nodes
                if( $$('ul>li',tonodes[i]).length == 0 ) {
                    removed_dups++;
                    tonodes[i].remove();
                }
                else {
                    dups_with_children++;
                    kept++;
                }
            }
            else {
                kept++;
                last_occurrences.set(url,i);
            }
        }
        console.log(dups_with_children + " duplicates with children were NOT removed.");
        console.log(removed_dups + " duplicates with no children were removed.");
        console.log(kept + " tabsoutliner tabs were kept.");
    }
    else {
        console.error("Expected number of tabsoutliner nodes to be equal to number of url occurrences. This was false, so stopping to be safe.");
    }
    // undoes expand all
    $('#expandAllButton').click();
}
removeDups();
+1

Amazing! Apparently I had 47608 tabs outliner tabs.

17271 distinct urls.

1026 duplicates with children were NOT removed.

29311 duplicates with no children were removed.
18297 tabsoutliner tabs were kept


Kinda makes me wonder about the 1k unremoved, but damn it cleared nearly 30,000 nodes!

+1

That script was pretty brutal to my browser: it removed tons of my tabs. Fortunately, I had a volume shadow copy from a restore point.


I think it detected that I have one URL twice, but once it's a parent node of like 90 others, and it killed the entire tree.

I'm sorry! There might be a race condition where the expand all button (which I thought would prevent that) doesn't work fast enough. 

I'll just modify it to never delete tabs with children. 

Actually, the changes to the trees don't persist at all! The Python script above is probably a better bet.

Hi guys, so what is the way to detect and remove duplicates, leaving just one? I am uncertain whether those scripts work well. I wouldn't like to remove valuable stuff or entire trees without duplicates.

I just tried my hand at this. It will only check top-level nodes and remove unnecessary™ ones, meaning that if several top-level nodes contain the very same tabs, only one of them is kept; if there are differences, that's your problem to solve.

*this will not do a general-purpose deletion of tabs with the same URLs, it only compares entire top level nodes! — but having it the other way wouldn't be that difficult*

It will prefer keeping active windows, and otherwise keep the first one (since that usually has more intact favicons in my experience).

Let me know how it goes for you, and obviously don't be too shy to have a backup.

https://gist.github.com/Luckz/74cdd60a795e72f677c0db5c660a4922

P.S.: root nodes that are not windows/groups should not be affected, so silly free-floating top-level separator lines should survive.

P.P.S.: A built in 'reconciliation' process would of course still be vastly more complicated and powerful. 

Thanks! When you say it will only check top-level trees, what do you mean? It doesn't check all links independently of whether they are top or bottom?

+1

It does not compare tab links to each other at all, but it compares Windows/Groups* by treating them as the sum of their links (in order).

Top level meaning immediate children (sub-nodes) of "Current Session", *not* nested windows that are inside Groups or live as children of TextNotes or something.

For example, let's say you have a saved window with three tabs: A, B, C, and an unloaded tab D.

Now your browser / PC crashes, and you use Chrome's crash restore.

T.O. will (I think) show you a green-styled crashed window and another normal blue/grey window. One of them has [A,B,C] (that were loaded at the time of the crash and restored), the other will have [A,B,C,D].

My script would not do anything, because [A,B,C] and [A,B,C,D] are different.

A script that compares tabs to other tabs and removes tabs with already-seen URLs might leave you with [A,B,C] and [D], which is kind of stupid too.

To do what I call reconciliation, the extension author would have to give descriptors to each window that persist across restarts, crashes, etc, and then intelligently merge while keeping nesting information and custom names and all of those.

If all tabs in a window are loaded and you simply quit & restart Chrome and restore the session, this will create an identical duplicate (with loss of nesting) and my script will keep only one copy.

*: TO's "groups" are also windows, they just have a different icon ^_^
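
To make the comparison rule concrete, here is a small sketch of the idea (not the code from the gist): each top-level window is reduced to a "signature", the ordered list of its tab URLs, and only windows with identical signatures count as duplicates. The node shape (`type`, `active`, `tabs`) and the names `dedupeTopLevelWindows` and `win` are hypothetical. In this sketch an active window takes over the slot of the first saved copy when both have the same signature.

```javascript
// Hypothetical shape: { type: "window" | "separator", active?: boolean, tabs?: [{ url }] }
function dedupeTopLevelWindows(topLevelNodes) {
  const slotBySignature = new Map(); // signature -> index in `result`
  const result = [];
  for (const node of topLevelNodes) {
    // Anything that is not a window (e.g. a separator) is passed through untouched.
    if (node.type !== "window") { result.push(node); continue; }
    // A window is treated as the ordered list of its tab URLs.
    const signature = JSON.stringify(node.tabs.map((tab) => tab.url));
    const slot = slotBySignature.get(signature);
    if (slot === undefined) {
      slotBySignature.set(signature, result.length);
      result.push(node); // first window with this signature
    } else if (node.active && !result[slot].active) {
      result[slot] = node; // prefer the active window over a saved copy
    }
    // otherwise: identical duplicate, dropped
  }
  return result;
}

const win = (urls, active = false) => ({ type: "window", active, tabs: urls.map((url) => ({ url })) });
const topLevel = [
  win(["A", "B", "C"]),        // saved copy
  win(["A", "B", "C", "D"]),   // differs, so it is kept
  { type: "separator" },
  win(["A", "B", "C"], true),  // active duplicate of the first window
];
const deduped = dedupeTopLevelWindows(topLevel);
console.log(deduped.map((n) => (n.type === "window" ? n.tabs.map((t) => t.url).join("") + (n.active ? "*" : "") : "---")));
```

As in the example above, [A,B,C] and [A,B,C,D] are both kept because their signatures differ; only the exact duplicate of [A,B,C] is collapsed.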

You are right, this is the approach that makes sense. Thanks for such a clear explanation!